Preface


This volume was prepared in conjunction with a Symposium held in Berkeley 
June 5–7, 2005, as a tribute to Professor Pravin Varaiya. The contributions represent
most of the lectures given at the meeting. The Symposium brought together former 
students, collaborators and friends from throughout the world to celebrate Pravin’s 
career as he approached the memorable occasion of his 65th birthday. The authors, 
speakers, organizers, supporters and attendees of the Symposium are very pleased to 
dedicate this work to Pravin, to congratulate him on his many seminal contributions, 
and to thank him for his leadership in the fields of systems, control and networks 
over the past four decades. 

Pravin Varaiya was born on October 29, 1940 in Bombay, India. He earned the 
B.E. degree in Electrical Engineering from the University of Bombay in 1960, and 
then began his graduate studies at the University of California, Berkeley. The early 
1960s was an exciting time during which the foundations of systems and control were 
developed, and Berkeley contributed to this development through the research of 
Professors Arthur Bergen, Charles Desoer, Mac Hopkin, Eli Jury, Elijah Polak, Otto 
Smith, and Lotfi Zadeh. Professor Eugene Wong joined the faculty in 1963 and con- 
tributed to the understanding of stochastic systems. Berkeley attracted outstanding 
visiting faculty, including Moshe Zakai and Bill Root. The faculty trained and men- 
tored a strong group of graduate students, including Mike Athans, Dick Mortensen, 
Jack Wing, Jim Eaton, Cesare Galtieri, Barry Whalen, and Pravin Varaiya. 

Pravin was a Member of the Technical Staff of Bell Laboratories during 1962- 
1963, following completion of his M.S. at Berkeley. He and Ruth Kosh were married 
on June 30, 1963, while Pravin was at Bell Labs. Through the years, Ruth would be 
a constant supporter of Pravin and his work. 

Pravin joined the faculty of the Department of Electrical Engineering and Com- 
puter Sciences at U.C. Berkeley upon completing his Ph.D. in 1966. From the begin- 
ning, his work was marked by creativity, rigor, timeliness and impact. These quali- 
ties alone are enough to launch an outstanding career in academia. However, Pravin 
demonstrated something more: an unusual breadth of interest and ability that ex- 
tended beyond his original research area of control and optimization to include com- 
munication and information theory, stochastic processes, game theory, and circuit 
and network theory. In each area that he entered, Pravin made important and lasting
contributions. He quickly moved through the academic ranks, gaining the rank of 
Professor in 1970, and became well known and highly regarded in the field at an 
early age. 

His interests continued to expand, building on a solid foundation in system and 
control theory and mathematics. In the early 1970s, Pravin started his research in 
economics, focusing mainly on issues of urban economics, such as the design of rent 
control, urban land use, and the economics of home ownership, but also contribut- 
ing to general economic theory. This entry into economic research was followed by 
his appointment as Professor of Economics at Berkeley in 1975. His teaching and 
research duties would be split between electrical engineering and economics until 
1992, when he decided to again focus his full attention on engineering. 

Later in the 1970s and into the 1980s, Pravin steadily increased his activities in 
the area of communications, with an emphasis on communication networks, and si- 
multaneously began a research effort in the field of electric power systems, focusing 
on dynamics and control of nonlinear power system models. Under his direction, 
teams of research students and visitors made important contributions to the understanding
of many key issues in these areas. In the 1980s he also began research
efforts in discrete event systems and hybrid systems, as well as in pricing issues both 
for communication network services and for electric power. In the late 1980s he 
became involved in what was to become a major commitment with the Institute of 
Transportation Studies at Berkeley and the California PATH Program (Partners for 
Advanced Transit and Highways), a multi-university research program dedicated to 
the solution of California’s transportation problems. He made seminal contributions 
to the design of intelligent vehicle highway systems, building on his past research in 
large scale, multilayer, and hybrid systems. From 1994 to 1997 he was Director of 
the PATH program. 

Looking back over four decades, the contributions of Pravin Varaiya are diffi- 
cult to summarize in a few pages of this book. He has published extensively, having 
authored or co-authored four books and more than 280 technical papers. His pa- 
pers and books are lucid and have influenced many researchers, undergraduate and 
graduate students, and practitioners. He has also served on the Board of Directors of 
several technology companies, and has personally been involved in technology trans- 
fer through involvement in start-up companies. In addition to his research, teaching, 
and writing activities, Pravin has consistently managed to also maintain a significant 
level of public service by working for human rights causes around the world. 

During leaves of absence from Berkeley, Pravin has held visiting appointments at 
the Federal University of Rio de Janeiro (Fall 1970) and MIT (January 1974–January
1975).

Pravin Varaiya has been recognized with many awards and distinctions. He has 
held a Guggenheim Fellowship (1972) and a Miller Research Professorship (July
1978–June 1979). He holds an Honorary Doctorate from l’Institut National Polytechnique
de Toulouse. In 2002 he was awarded the IEEE Field Medal in Control
Systems for “outstanding contributions to stochastic and adaptive control and the 
unification of concepts from control and computer science.” He is a Fellow of the 
IEEE and a member of the National Academy of Engineering. He is currently Nortel
Networks Distinguished Professor in the Department of Electrical Engineering and 
Computer Sciences at the University of California, Berkeley. He has served on the 
editorial boards of many prominent journals, and is currently on the editorial boards 
of Discrete Event Dynamical Systems, Transportation Research–C, and the Journal
of Economic Dynamics and Control. 

As mentioned earlier, this volume represents most, but not all, of the presen- 
tations at the Symposium. The chapters are written by experts, and each is at the 
forefront of research or presents a careful review and assessment of an important 
topic. Since it was not feasible to cover all the areas in which Pravin has contributed 
over the years, emphasis was placed on the areas of his greatest current interest. 
The chapters are thus broadly classified into four categories: I. Hybrid Systems; II. 
System Theory and Design; III. Networks; and IV. Transportation. Connections with 
the work of Pravin Varaiya are evident in all of the contributions, and many authors 
chose to make these connections explicitly. 

Part I consists of three chapters on hybrid systems. The first chapter, by Hwang, 
Stipanović and Tomlin, extends numerical methods for reachability analysis devel- 
oped for linear systems to feedback linearizable systems, linear dynamic games, and 
norm-bounded nonlinear systems. The chapter by Kurzhanski solves a problem of 
measurement-based feedback control of systems with unknown but bounded uncer- 
tainties using an ellipsoidal approximation technique developed with Varaiya. The 
third chapter, by Piazza and Mishra, develops a concept of functional hybrid au- 
tomata for use in biological system models, then proceeds to reduce these models to 
differential equation models and to study their stability. 

Part II also contains three chapters. Davis gives a survey of representation of 
martingales as stochastic integrals, with applications to special classes of stochastic 
processes and to mathematical finance. The second chapter is by Lee, who discusses 
combining computing and engineering through redesign of the systems part of the 
undergraduate curriculum. The third chapter, by Deshpande, outlines some of the 
emerging themes in system design for network equipment, emphasizing systems for 
which throughput performance is the primary concern. 

Part III consists of seven chapters on networks. Gastpar uses information-theoretic 
bounds to extend to sensor networks a result of Varaiya and Walrand on the effec- 
tiveness of feedback in a causal coding context. Liu and Goldsmith consider the joint 
design of a wireless network and networked controllers, and illustrate the framework 
they introduce through cross-layer optimization of the link layer, MAC layer, and 
sample period selection of a double inverted pendulum system. Garg, Borkar and 
Manjunath consider the pricing of network resources to control demand behavior 
during periods of congestion in the internet. Gupta and Walrand propose a novel 
backoff mechanism for ad-hoc networks and demonstrate its tendency to improve 
fairness. Baras and Jiang discuss a distributed cooperative game-theoretic framework 
for trust establishment in distributed networks such as mobile ad-hoc networks, sen- 
sor networks and ubiquitous computing systems. Johari and Tsitsiklis consider net- 
work settings such as communication networks and power systems, and study the 
design of market mechanisms that minimize efficiency loss and are robust to gaming
behavior of market participants. Stoenescu and Teneketzis present a mechanism
design theory view of decentralized network resource allocation. 

Part IV contains four chapters on transportation systems. Shladover discusses 
automated highway systems research with a focus on Pravin Varaiya’s influence on 
the field. Kotsialos and Papageorgiou use optimal control and simulation to study the 
potential of freeway network ramp metering control to reduce congestion. The third 
chapter, by Mahmassani and Zhou, develops a state-space model for real-time trip 
demand pattern estimation and prediction, along with optimal updating algorithms 
and application to real network data. The fourth chapter, by Horowitz, Muñoz and 
Sun, presents a new traffic-responsive on-ramp switched control strategy as well as 
test results obtained using a traffic simulator. 

On behalf of all the contributors to this Festschrift volume and the participants 
in the associated Symposium, the members of the Organizing Committee would like 
to congratulate Pravin Varaiya on his uniquely successful career to date, and to wish
him many more years of continued success. Mostly, however, the contributors and 
organizers would simply like to wish him well on his 65th birthday, and to sincerely 
thank him for his leadership in research, and for being such a dedicated mentor, 
colleague and friend of the many students and research collaborators who have been 
privileged to work with him. 


College Park, MD E.H. Abed 
Stanford, CA A. Goldsmith 
Berkeley, CA R. Horowitz 
Urbana, IL P.R. Kumar
Berkeley, CA S.S. Sastry 


March 2005 Symposium Organizing Committee 
Summary. This chapter presents applications of polytopic approximation methods for reachable
set computation using dynamic optimization. The problem of computing exact reachable
sets can be formulated in terms of a Hamilton-Jacobi partial differential equation (PDE). Numerical
solutions which provide convergent approximations of this PDE have computational
complexity which is exponential in the continuous variable dimension. Using dynamic opti- 
mization and polytopic approximation, computationally efficient algorithms for overapprox- 
imative reachability analysis have been developed for linear dynamical systems [1]. In this 
chapter, we extend these to feedback linearizable nonlinear systems, linear dynamic games, 
and norm-bounded nonlinear systems. Three illustrative examples are presented. 


1.1 Introduction 


Reachability analysis for continuous and hybrid systems is important for the au- 
tomatic verification of safety properties and for the synthesis of safe controllers for 
these systems [2, 3]. Convergent approximations of reachable sets for such systems 
can be computed by solving a particular Hamilton-Jacobi partial differential equa- 
tion (PDE) [3, 4]. Numerical methods have been devised to compute these convergent 
overapproximations [5], which work well in up to four to five continuous variable di- 
mensions, yet these methods are not practical for solving high dimensional problems. 
Therefore, approximate methods for reachable set computation have been proposed. 

Tiwari and Khanna [6] and Alur et al. [7] proposed predicate abstraction for 
reachable set computation: this method can be used to extract equivalent finite state 


* This research was supported by DARPA under the Software Enabled Control Program 
(AFRL contract F33615-99-C-3014), by ONR under MURI contract N00014-02-1-0720, 
and by an NSF Career Award (ECS-9985072). 
models from complex, infinite state models, which are used to find approximate
reachable sets of the original systems. In [8], Hwang et al. have used an augmented 
form of predicate abstraction to compute reachable sets for a simple biological cell 
network. However, since the accuracy of reachability analysis using predicate ab- 
straction greatly depends on the choice of polynomials for abstraction, it is important 
to have information about a given system a priori (from analysis and simulations) 
to get good results in the reachability analysis. Chutinan and Krogh [9, 10] present 
a method to approximate the flows of autonomous systems with convex polyhedra. 
An experimental system called d/dt [7, 11, 12] has been developed to approximate 
reachable sets for linear dynamical systems using griddy orthogonal polyhedra. Ideas 
based on projecting the initial or target set into a lower dimensional subset of the 
state space, performing the reach set computation in the lower dimensional space, 
and then back projecting to form an overapproximation of the actual reachable set in 
the full state space, are presented in [13, 14]. In all of these methods, however, it is 
difficult to compute the control input which is guaranteed to keep the system on the 
boundary or inside the set, from the boundary of the overapproximative set. 

Varaiya [1] has designed, using techniques from optimal control theory, a poly- 
topic approximation for linear systems. Kostousova [15] has developed two-sided 
approximations of reachable sets for linear dynamic systems using parallelotopes. 
Kurzhanski and Varaiya [16, 17] proposed an ellipsoidal approximation for forward 
and backward reachable sets (a computational tool VeriSHIFT [18] has been devel- 
oped based on their ideas) and in [19, 20], they define various types of reachable 
sets for linear time-varying systems with bounded perturbations using both open 
and closed-loop input laws. In [20], they propose ellipsoidal overapproximations of 
reachable sets for linear systems under uncertainty via solutions of a particular type 
of differential equation. In [21, 22], the authors have extended reachable set computations
to general nonlinear systems with state constraints and obstacles, using non-standard
Hamilton-Jacobi equations and variational inequalities. Overall, this seminal
work in exact and approximate reachable set calculation suggests new research
directions in computational methods for such problems. This work indeed motivated
the current chapter.

In this chapter, we review the method proposed by Varaiya [1] to compute reach- 
able sets for linear time invariant systems. Inspired by Kurzhanski and Varaiya 
[16, 17, 19, 20] and by the work of Khrustalev [23], we compute approximate reach- 
able sets for feedback linearizable nonlinear systems, linear dynamic games, and 
norm-bounded nonlinear systems. We present three examples, one of which is a two- 
aircraft three-dimensional collision avoidance example which we have used in other 
work [5]. 

This chapter is organized as follows. Motivation for this study is described in 
Section 1.2. Computations of polytopic reachable sets for linear dynamical systems, 
feedback linearizable nonlinear systems, linear dynamic games, and norm-bounded 
nonlinear systems are presented in Section 1.3. Examples are presented in Section 
1.4. Conclusions are presented in Section 1.5. 
1.2 Background and Motivation


Consider a dynamical system, 


$$\dot{x}(t) = f(x(t), u(t), d(t)), \quad x(0) \in X_0 \ (\text{or } x(t_f) \in V_0), \quad t \in [0, t_f] \tag{1.1}$$

where $0 < t_f < \infty$, $x \in \mathbb{R}^n$, $u \in U \subset \mathbb{R}^m$ is the control input, $d \in D \subset \mathbb{R}^p$
is the disturbance input, $X_0 = \{x : l(x) \le 0\}$ is an initial set of states, and $V_0 =
\{x : \gamma(x) \le 0\}$ is a target set of states. We assume $f$ to be Lipschitz. The spaces of
admissible control input trajectories and disturbance input trajectories are denoted as
the spaces of piecewise continuous functions $\mathcal{U} = \{u(\cdot) \in PC^0 \mid u(t) \in U,\ 0 \le t \le t_f\}$
and $\mathcal{D} = \{d(\cdot) \in PC^0 \mid d(t) \in D,\ 0 \le t \le t_f\}$ respectively. The forward and the
backward reachable sets of the system (1.1) are defined as follows.


Definition 1. The forward reachable set $X(\tau)$ at time $\tau$ ($0 \le \tau \le t_f$) of the system
(1.1) from the initial set $X_0$ is the set of all states $x(\tau)$ such that there exists a
control input $u(t) \in U$ ($0 \le t \le \tau$), for all disturbance inputs $d(t) \in D$ ($0 \le t \le \tau$),
for which $x(\tau)$ is reachable from some $x(0) \in X_0$ along a trajectory satisfying
(1.1).


Definition 2. The backward reachable set $Y(\tau)$ at time $\tau$ ($0 \le \tau \le t_f$) of the system
(1.1) from the target set $V_0$ is the set of all states $x(\tau)$ such that there exists a control
input $u(t) \in U$ ($\tau \le t \le t_f$), for all disturbance inputs $d(t) \in D$ ($\tau \le t \le t_f$), for
which some $x(t) \in V_0$ is reachable from $x(\tau)$ along a trajectory satisfying (1.1).


It has been shown that a forward reachable set computation can be formulated as a 
dynamic optimization problem [17, 23]. The forward reachable set of the dynamical 
system (1.1) at time $\tau$ ($0 \le \tau \le t_f$) is shown to be [17]:

$$X(\tau) = \{x : v(x, \tau) \le 0\} \tag{1.2}$$

where $v(x,\tau)$ is a (viscosity) solution of the Hamilton-Jacobi-Isaacs (HJI) partial
differential equation,

$$D_t v(x,t) + \max_{u \in U} \min_{d \in D} \{\langle D_x v(x,t), f(x,u,d) \rangle\} = 0 \tag{1.3}$$

with $v(x,0) = l(x)$, $\langle p, q \rangle = p^T q$ the inner product in $\mathbb{R}^n$, and where $D_{(\cdot)}$ represents
the partial derivative with respect to the subscripted variable. Thus, the forward
reachable set of the dynamical system (1.1) is the zero sublevel set of the solution to
the HJI equation in (1.3).

Similarly, the backward reachable set of the dynamical system (1.1) at time $\tau$
($0 \le \tau \le t_f$) is the zero sublevel set of the solution to the HJI equation [17],

$$D_t v(x,t) + \min_{u \in U} \max_{d \in D} \{\langle D_x v(x,t), f(x,u,d) \rangle\} = 0 \tag{1.4}$$

with $v(x,t_f) = \gamma(x)$.
In [4, 5], a numerical tool for computing convergent approximations for backward
reachable sets is designed and presented. This method is based on the level
set method for computing solutions to PDEs [24]. The computational complexity of
this tool is exponential in the number of continuous variable dimensions: it has
been shown to work well in up to four or five continuous variable dimensions,
yet for larger problems computation time is currently prohibitive. Numerical convergence
has been demonstrated on several examples; we will use a “benchmark”
three-dimensional example from [5] in this chapter.
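
To make the exponential growth concrete, the short sketch below counts the nodes of a uniform grid. It assumes 50 nodes per dimension, the resolution reported for the three-dimensional example later in this section; the function name `grid_nodes` is illustrative only and is not part of the tool in [4, 5].

```python
def grid_nodes(nodes_per_dim: int, dims: int) -> int:
    """Number of nodes in a uniform grid with the same resolution in every dimension."""
    return nodes_per_dim ** dims


# Node counts for one to six continuous dimensions at 50 nodes per dimension.
for n in range(1, 7):
    print(n, grid_nodes(50, n))
```

Under this assumption, going from three to five dimensions multiplies the node count by 2,500.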

Consider planar kinematic models of two aircraft, labeled 1 and 2. Let the relative
position and orientation of aircraft 2 with respect to aircraft 1 be represented by
$(x_r, y_r, \psi_r) \in \mathbb{R}^2 \times [-\pi, \pi)$. Given the absolute positions and orientations of the
two aircraft, denoted as $x_i, y_i, \psi_i$ for $i = 1, 2$, the relative coordinates are defined as:
$x_r = \cos\psi_1 (x_2 - x_1) + \sin\psi_1 (y_2 - y_1)$, $y_r = -\sin\psi_1 (x_2 - x_1) + \cos\psi_1 (y_2 - y_1)$,
$\psi_r = \psi_2 - \psi_1$. The relative kinematics are thus given by:

$$\begin{aligned}
\dot{x}_r &= -\sigma_1 + \sigma_2 \cos\psi_r + \omega_1 y_r, \\
\dot{y}_r &= \sigma_2 \sin\psi_r - \omega_1 x_r, \\
\dot{\psi}_r &= \omega_2 - \omega_1,
\end{aligned} \tag{1.5}$$


where $\sigma_i$ is the linear velocity of aircraft $i$ and $\omega_i$ is its angular velocity. Safety is
encoded as a 5 nautical mile radius cylinder “protected zone” centered at the origin
of the relative frame. In this chapter, following the notation in Definition 2 (which
is different from that in [5]), we define the angular velocity of aircraft 2 ($\omega_2$) as the
control input that steers the system (1.5) into the target set and the angular velocity
of aircraft 1 ($\omega_1$) as the disturbance input that keeps the system (1.5) outside of the
target set. Posing this problem as a game, we label aircraft 1 as “evader” and aircraft 2
as “pursuer”, and we compute the set of states $(x_r, y_r, \psi_r)$ for which, for all possible
disturbance inputs $\omega_1$ (the action of the evader), there is a control input $\omega_2$ (the action of the
pursuer) such that the system state enters the protected zone, which we consider the
target set of the game. For values $\sigma_1 = \sigma_2 = 5$ and $\omega_i \in [-1, 1]$ ($i \in \{1, 2\}$),
the problem has been solved numerically, and the results (solid surface) are shown
in Fig. 1.4 (Courtesy of I. Mitchell [5]). This computation took approximately 4
minutes to run on a Sun UltraSparc II, in which 50 grid nodes in each dimension
were used.
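
The coordinate change and the relative kinematics (1.5) can be sketched in a few lines of Python. This is a minimal illustration, not the code used in [5]: the function names are hypothetical, the heading wrap into $[-\pi, \pi)$ is one possible convention, and the protected zone is modeled as the disc of radius 5 in the $(x_r, y_r)$ plane.

```python
import math


def relative_coordinates(x1, y1, psi1, x2, y2, psi2):
    """Pose of aircraft 2 expressed in the body frame of aircraft 1."""
    dx, dy = x2 - x1, y2 - y1
    xr = math.cos(psi1) * dx + math.sin(psi1) * dy
    yr = -math.sin(psi1) * dx + math.cos(psi1) * dy
    # Wrap the relative heading into [-pi, pi).
    psir = (psi2 - psi1 + math.pi) % (2.0 * math.pi) - math.pi
    return xr, yr, psir


def relative_kinematics(state, sigma1, sigma2, omega1, omega2):
    """Right-hand side of (1.5): time derivative of (x_r, y_r, psi_r)."""
    xr, yr, psir = state
    return (
        -sigma1 + sigma2 * math.cos(psir) + omega1 * yr,
        sigma2 * math.sin(psir) - omega1 * xr,
        omega2 - omega1,
    )


def in_protected_zone(state, radius=5.0):
    """True if the relative position lies inside the protected-zone cylinder."""
    xr, yr, _ = state
    return math.hypot(xr, yr) <= radius


# Aircraft 2 is 3 units ahead of aircraft 1, both heading along the y axis.
rel = relative_coordinates(0.0, 0.0, math.pi / 2, 0.0, 3.0, math.pi / 2)
print(rel, in_protected_zone(rel))
print(relative_kinematics(rel, 5.0, 5.0, 1.0, -1.0))
```

The membership test plays the role of the implicit target-set description: a state is in the target set exactly when its distance from the origin of the relative frame, minus the radius, is nonpositive.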

A version of this example may also be solved analytically [25], and it may be 
verified using this that the average error in computation is less than one tenth of a 
grid cell, with maximum error always less than one grid cell. 

In the following section, we extend Varaiya’s method [1] to treat this kind of 
system and in Section 1.4, we compare the computation above with the resulting 
approximation. 


1.3 Computation of polytopic reachable sets 


We first define the overapproximate reachable set [17] (here we specialize to 
the case of (1.1) in which there are no disturbances). Assume that $x_*(0) \in X_0$
and $u_*(t) \in U$ for all $t \ge 0$ such that $x_*(\tau) \in X(\tau)$ ($0 \le t \le \tau$). Then, an
overapproximate solution of the HJI equation in (1.3) is defined as a
function $v^+(x,t)$ satisfying [17, 23]:

$$\begin{aligned}
\frac{d}{dt} v^+(x_*, t)
&= D_t v^+(x_*, t) + \langle D_x v^+(x_*, t), f(x_*, u_*) \rangle \\
&\le D_t v^+(x_*, t) + \max_{u \in U} \{\langle D_x v^+(x_*, t), f(x_*, u) \rangle\} \\
&\le \mu(t)
\end{aligned} \tag{1.6}$$


where $v^+(x_*, t)$ is a piecewise continuous function, and $\mu(t)$ is a positive-definite,
integrable function. By integrating (1.6) from 0 to $\tau$, we obtain an overapproximative
reachable set of the dynamical system (1.1) at time $\tau$ as:

$$V^+(\tau) = \Big\{x : v^+(x,\tau) \le \int_0^\tau \mu(t)\,dt + \max_{x(0) \in X_0} v^+(x(0),0)\Big\}. \tag{1.7}$$

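
The integration step that leads from (1.6) to (1.7) can be written out as follows. This is a sketch of the omitted reasoning, assuming $v^+$ is regular enough along the trajectory $x_*$ for the fundamental theorem of calculus to apply.

```latex
\begin{aligned}
v^+(x_*(\tau),\tau) - v^+(x_*(0),0)
  &= \int_0^\tau \frac{d}{dt}\, v^+(x_*(t),t)\,dt
   \;\le\; \int_0^\tau \mu(t)\,dt, \\
v^+(x_*(\tau),\tau)
  &\le \int_0^\tau \mu(t)\,dt + v^+(x_*(0),0)
   \;\le\; \int_0^\tau \mu(t)\,dt + \max_{x(0)\in X_0} v^+(x(0),0).
\end{aligned}
```

Every state reached at time $\tau$ from $X_0$ therefore satisfies the inequality that defines the set in (1.7).
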
Next, we review the polytopic overapproximation of reachable sets for linear 
dynamical systems and derive computational methods for polytopic overapproximate 
reachable sets for feedback linearizable nonlinear systems, linear dynamic games, 
and norm-bounded nonlinear systems. 


1.3.1 Linear dynamical systems 


In this section, we review the polytopic overapproximation of reachable sets for 
linear systems from [1]. Consider a time-varying linear dynamical system 


$$\dot{x}(t) = A(t)x(t) + B(t)u(t), \quad x(0) \in X_0, \quad u(t) \in U \tag{1.8}$$

where the initial set $X_0$ and the admissible control input set $U$ are assumed to be
convex polytopes which have $N$ and $N_u$ faces respectively. In this chapter, we assume
the initial set $X_0$ is a polytope. In general, however, $X_0$ may be any convex compact
set, in which case the number of faces used to approximate it is a design parameter:
the more faces, the tighter the overapproximate reachable set.

A convex polytope $P$ with $K$ faces can be represented in two ways. It can be
represented as the bounded intersection of $K$ half spaces,

$$P = \bigcap_{i=1}^{K} \{x \mid h_i^T x \le r_i\} \tag{1.9}$$

where $h_i$ is a normal vector to the $i$th face of the polytope $P$. A convex polytope can
also be represented as the convex hull of its vertices: if a convex polytope $P$ has $m$
vertices $\{w^1, \ldots, w^m\}$, then

$$P = \Big\{x \;\Big|\; x = \sum_{i=1}^{m} \alpha_i w^i,\ \alpha_i \ge 0,\ \sum_{i=1}^{m} \alpha_i = 1\Big\}. \tag{1.10}$$
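
The two representations in (1.9) and (1.10) can be illustrated with a small Python sketch. The helper names and the unit-square data are hypothetical and chosen only for illustration.

```python
def in_polytope(H, r, x, tol=1e-12):
    """Half-space test (1.9): x is in P iff h_i^T x <= r_i for every face i."""
    return all(
        sum(h_k * x_k for h_k, x_k in zip(h, x)) <= r_i + tol
        for h, r_i in zip(H, r)
    )


def convex_combination(vertices, alphas, tol=1e-12):
    """Vertex form (1.10): sum_i alpha_i * w^i with alpha_i >= 0 and sum alpha_i = 1."""
    if any(a < -tol for a in alphas) or abs(sum(alphas) - 1.0) > tol:
        raise ValueError("alphas must be nonnegative and sum to one")
    dim = len(vertices[0])
    return [sum(a * w[k] for a, w in zip(alphas, vertices)) for k in range(dim)]


# Unit square [0, 1] x [0, 1] in both representations (hypothetical example data).
H = [[1.0, 0.0], [-1.0, 0.0], [0.0, 1.0], [0.0, -1.0]]
r = [1.0, 0.0, 1.0, 0.0]
W = [[0.0, 0.0], [1.0, 0.0], [1.0, 1.0], [0.0, 1.0]]

x = convex_combination(W, [0.25, 0.25, 0.25, 0.25])
print(x, in_polytope(H, r, x))        # centre of the square, inside
print(in_polytope(H, r, [1.5, 0.5]))  # a point outside the square
```

The half-space form is the convenient one for the reachable-set computation that follows, since each face can be propagated independently; the vertex form is convenient for maximizing a linear function over the set.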
Define a set of linear functions as

$$v_i^+(x,t) = h_i^T(t)\,x, \quad i \in \{1, 2, \ldots, N\}. \tag{1.11}$$

These linear functions are used to represent a convex polytope as shown in (1.9).
In order to find a polytopic overapproximate reachable set, we solve for $v_i^+(x,t)$ in
(1.11) that satisfies (1.6). Then, (1.6) becomes

$$\begin{aligned}
& D_t v_i^+(x,t) + \max_{u \in U} \{\langle D_x v_i^+(x,t), f(x,u) \rangle\} \\
&\quad = \langle \dot{h}_i(t), x(t) \rangle + \langle A(t)^T h_i(t), x(t) \rangle + \max_{u \in U} \{\langle h_i(t), B(t)u(t) \rangle\} \\
&\quad \le \mu(t).
\end{aligned} \tag{1.12}$$


From optimal control theory [26], the adjoint equation for linear systems when the
input set does not depend on $x$ is $\dot{\lambda}(t) = -A(t)^T \lambda(t)$. If we choose $h_i(t) = \lambda(t)$
($i \in \{1, 2, \ldots, N\}$), then

$$\langle \dot{h}_i(t), x(t) \rangle + \langle A(t)^T h_i(t), x(t) \rangle = 0. \tag{1.13}$$

This represents the evolution of the normal vector of the $i$th face. Let $h_i(0)$, $i \in
\{1, 2, \ldots, N\}$, be the normal vectors of the faces of the initial set $X_0$. Then, the
solution to (1.13) is

$$h_i(t) = \Phi(t,0)\,h_i(0), \quad i \in \{1, 2, \ldots, N\} \tag{1.14}$$

where $\Phi(t,0)$ is the state transition matrix satisfying $\dot{\Phi} = -A(t)^T \Phi$, $\Phi(0,0) = I$.
If the system dynamics in (1.8) is time invariant, then $\Phi(t,0) = e^{-A^T t}$ and (1.14)
becomes

$$h_i(t) = e^{-A^T t} h_i(0), \quad i \in \{1, 2, \ldots, N\}. \tag{1.15}$$
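
A quick numerical check of (1.13) is sketched below. It integrates the state equation with $u = 0$ and the adjoint equation $\dot{h} = -A^T h$ side by side and confirms that the inner product $\langle h(t), x(t) \rangle$ stays constant. The sketch uses a classical Runge-Kutta integrator in place of the closed form (1.15), and the matrix and initial vectors are hypothetical.

```python
def rk4_step(f, y, dt):
    """One classical Runge-Kutta step for the autonomous ODE y' = f(y)."""
    k1 = f(y)
    k2 = f([yi + 0.5 * dt * ki for yi, ki in zip(y, k1)])
    k3 = f([yi + 0.5 * dt * ki for yi, ki in zip(y, k2)])
    k4 = f([yi + dt * ki for yi, ki in zip(y, k3)])
    return [yi + dt * (a + 2 * b + 2 * c + d) / 6.0
            for yi, a, b, c, d in zip(y, k1, k2, k3, k4)]


def propagate(A, x0, h0, t_end, steps=1000):
    """Integrate x' = A x (u = 0) and the adjoint h' = -A^T h side by side."""
    n = len(x0)

    def state_rhs(x):
        return [sum(A[i][k] * x[k] for k in range(n)) for i in range(n)]

    def adjoint_rhs(h):
        return [-sum(A[k][i] * h[k] for k in range(n)) for i in range(n)]

    x, h, dt = list(x0), list(h0), t_end / steps
    for _ in range(steps):
        x, h = rk4_step(state_rhs, x, dt), rk4_step(adjoint_rhs, h, dt)
    return x, h


def dot(a, b):
    return sum(ai * bi for ai, bi in zip(a, b))


A = [[0.0, 1.0], [-2.0, -3.0]]    # hypothetical LTI system matrix
x0, h0 = [1.0, 0.5], [1.0, 0.0]   # an arbitrary state and a face normal
xT, hT = propagate(A, x0, h0, t_end=1.0)
print(dot(h0, x0), dot(hT, xT))   # the two inner products should agree closely
```

With a nonzero input, the inner product changes only through the term $\langle h_i(t), B(t)u(t) \rangle$, which is the quantity bounded in the next step.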


Thus, for a linear time invariant system, the evolution of normal vectors can be determined
analytically. We denote $\{u^1, \ldots, u^{m_u}\}$ as the vertices of the input set $U$.
Since $U$ is a convex polytope, the following must hold (for $j \in \{1, \ldots, m_u\}$):

$$\max_{u \in U} \langle h_i(t), B(t)u(t) \rangle = \max_{j} \langle h_i(t), B(t)u^j \rangle \le \mu(t) \tag{1.16}$$

that is, the maximum is achieved at a vertex of $U$ [1]. Furthermore, if the system
dynamics in (1.8) is time invariant, (1.16) is simplified to

$$\max_{j} \langle h_i(t), Bu^j \rangle = \max_{j} \langle e^{-A^T t} h_i(0), Bu^j \rangle \le \mu(t) \tag{1.17}$$

for $j \in \{1, \ldots, m_u\}$. We choose $\mu(t) = \max_j \langle h_i(t), B(t)u^j \rangle$ and note that $\mu(t)$
is always positive for a properly chosen input set $U$ (e.g., chosen such that $0 \in U$).
Then, the linear function $v_i^+(x,t)$ in (1.11) is a supporting hyperplane of the exact
reachable set [1]. A polytopic overapproximate forward reachable set $V^+(t)$ for the
dynamical system (1.8) is the intersection of half spaces as follows:


$$V^+(t) = \bigcap_{i=1}^{N} \Big\{x : v_i^+(x,t) \le \int_0^t \max_j \langle h_i(s), B(s)u^j \rangle\,ds + \max_{x(0) \in X_0} v_i^+(x(0),0)\Big\}. \tag{1.18}$$

The set $V^+(t)$ is a convex polytope which contains the exact reachable set at time $t$
since each $v_i^+(x,t)$ in (1.18) is a supporting hyperplane of the exact reachable set. If
the system dynamics is linear time invariant, $V^+(t)$ becomes

$$V^+(t) = \bigcap_{i=1}^{N} \Big\{x : v_i^+(x,t) \le \int_0^t \max_j \langle e^{-A^T s} h_i(0), Bu^j \rangle\,ds + \max_{x(0) \in X_0} v_i^+(x(0),0)\Big\}. \tag{1.19}$$
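
The time-invariant formula (1.19) can be turned into a short NumPy sketch. This is an illustrative implementation under stated assumptions, not the authors' MATLAB code: the matrix exponential is approximated by a truncated Taylor series (adequate only for small, well-scaled matrices), the time integral uses the trapezoid rule, each offset `r0[i]` is assumed to be tight so that it equals $\max_{x(0) \in X_0} v_i^+(x(0),0)$, and the double-integrator data are hypothetical.

```python
import numpy as np


def expm_taylor(M, terms=30):
    """Truncated Taylor series for exp(M); adequate only for small, well-scaled M."""
    M = np.asarray(M, dtype=float)
    result = np.eye(M.shape[0])
    term = np.eye(M.shape[0])
    for k in range(1, terms):
        term = term @ M / k
        result = result + term
    return result


def polytopic_overapprox(A, B, H0, r0, U_vertices, t, steps=200):
    """Return (H_t, r_t) with V+(t) = {x : H_t x <= r_t}, following (1.19).

    H0, r0: faces of X0 = {x : H0 x <= r0}, with each r0[i] assumed tight.
    U_vertices: one vertex of the input polytope U per row.
    """
    A, B, H0 = (np.asarray(M, dtype=float) for M in (A, B, H0))
    r0 = np.asarray(r0, dtype=float)
    U_vertices = np.asarray(U_vertices, dtype=float)
    s_grid = np.linspace(0.0, t, steps + 1)
    rates = np.empty((steps + 1, H0.shape[0]))
    for k, s in enumerate(s_grid):
        Hs = H0 @ expm_taylor(-A * s)   # row i is h_i(s)^T = h_i(0)^T e^{-A s}
        rates[k] = (Hs @ B @ U_vertices.T).max(axis=1)   # max_j <h_i(s), B u^j>
    ds = t / steps
    # Trapezoid rule for the integral of the rates over [0, t].
    integral = ds * (0.5 * rates[0] + rates[1:-1].sum(axis=0) + 0.5 * rates[-1])
    return H0 @ expm_taylor(-A * t), r0 + integral


# Hypothetical data: double integrator, X0 = [-1, 1]^2, U = [-1, 1].
A = [[0.0, 1.0], [0.0, 0.0]]
B = [[0.0], [1.0]]
H0 = [[1.0, 0.0], [-1.0, 0.0], [0.0, 1.0], [0.0, -1.0]]
r0 = [1.0, 1.0, 1.0, 1.0]
H1, r1 = polytopic_overapprox(A, B, H0, r0, [[-1.0], [1.0]], t=1.0)
print(H1)
print(r1)
```

Each face is propagated independently, so the cost grows with the number of faces and the number of input vertices rather than with a grid over the state space.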


1.3.2 Feedback linearizable nonlinear systems 


In this section, we consider a class of nonlinear systems [27], in which $u(t)$ is a
feedback control:

$$\dot{x}(t) = f(x) + g(x)u(t) \tag{1.20}$$

where

$$u(t) = a(x(t)) + b(x(t))v(t). \tag{1.21}$$

We assume that there exists a diffeomorphism $T$ such that $z = T(x)$, which transforms,
with a control input $u(t)$, the nonlinear system (1.20) into an equivalent linear
system [27]. Then, we can compute an overapproximate forward reachable set for
the nonlinear system (1.20) as follows:


• Step 1: Transform the nonlinear system (1.20) to an equivalent linear system,
$\dot{z}(t) = A(t)z(t) + B(t)v(t)$, with appropriate $u(t)$ and $T$.

• Step 2: Compute a polytopic overapproximate forward reachable set $V^+(t)$ of
the linear system following the procedure in Section 1.3.1.

• Step 3: Using the inverse state transformation $x = T^{-1}(z)$, we obtain the overapproximate
forward reachable set for the original nonlinear system (1.20) from
$V^+(t)$.


Since there is no approximation during the transformation and the transformation is a 
diffeomorphism on a given domain of interest, the forward reachable set obtained in 
Step 3 is guaranteed to be an overapproximate forward reachable set of the nonlinear 
system (1.20). 
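
The three steps can be sketched on a scalar toy system. The example below is hypothetical and not taken from the chapter: it assumes $\dot{x} = -x^2 + xu$ on $x > 0$ with feedback $u = x + v$, so that $\dot{x} = xv$, and the transformation $z = T(x) = \ln x$ gives the linear system $\dot{z} = v$. It also assumes the input bound is stated on the new input $v$. In one dimension a polytope is an interval with face normals $+1$ and $-1$.

```python
import math


def T(x):
    """State transformation z = T(x) = ln(x), a diffeomorphism on x > 0."""
    return math.log(x)


def T_inv(z):
    """Inverse transformation x = exp(z)."""
    return math.exp(z)


def reach_interval_linear(z_lo, z_hi, v_min, v_max, t):
    """Step 2 for z' = v (A = 0, B = 1): the normals h = +1 and h = -1 stay
    constant, so each bound moves by the integral of max_v <h, v>."""
    upper = z_hi + t * max(v_min, v_max)      # face with normal h = +1
    lower = z_lo - t * max(-v_min, -v_max)    # face with normal h = -1
    return lower, upper


def reach_interval_nonlinear(x_lo, x_hi, v_min, v_max, t):
    """Steps 1 to 3 for x' = -x^2 + x*u with u = x + v, which gives x' = x*v."""
    z_lo, z_hi = T(x_lo), T(x_hi)                                       # Step 1
    lower, upper = reach_interval_linear(z_lo, z_hi, v_min, v_max, t)   # Step 2
    return T_inv(lower), T_inv(upper)               # Step 3 (T_inv is increasing)


print(reach_interval_nonlinear(1.0, 2.0, -1.0, 1.0, t=1.0))
```

Because `T_inv` is increasing, mapping the two interval endpoints back is enough here. In higher dimensions the image of a polytope under $T^{-1}$ is generally no longer a polytope, although it still contains the reachable set of the nonlinear system.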


1.3.3 Linear dynamic games 
Now, we consider the linear dynamic game:

$$\dot{x}(t) = A(t)x(t) + B(t)u(t) + C(t)d(t), \quad x(0) \in X_0,\ u(t) \in U,\ d(t) \in D \tag{1.22}$$

where the initial set $X_0$, the admissible control input set $U$, and the disturbance
input set $D$ are assumed to be convex polytopes which have $N$, $N_u$, and $N_d$ faces
respectively. Then, the HJI equation in (1.3) for a forward reachable set computation
becomes [19, 20],

$$D_t v(x,t) + \max_{u \in U} \min_{d \in D} \{\langle D_x v(x,t),\ A(t)x(t) + B(t)u(t) + C(t)d(t) \rangle\} = 0. \tag{1.23}$$
To find an overapproximate solution to (1.23), we look for a set of linear functions
$v_i^+(x,t)$ in (1.11) satisfying (1.13), and compute

$$\begin{aligned}
& D_t v_i^+(x,t) + \max_{u \in U} \min_{d \in D} \{\langle D_x v_i^+(x,t),\ A(t)x(t) + B(t)u(t) + C(t)d(t) \rangle\} \\
&\quad = \max_{u \in U} \{\langle h_i(t), B(t)u(t) \rangle\} + \min_{d \in D} \{\langle h_i(t), C(t)d(t) \rangle\} \\
&\quad \le \mu(t).
\end{aligned} \tag{1.24}$$

We denote $\{u^1, \ldots, u^{m_u}\}$ and $\{d^1, \ldots, d^{m_d}\}$ as the vertices of $U$ and $D$ respectively.
Since (1.24) is linear with respect to $u$ and $d$, the maximum and the minimum
in (1.24) are achieved at vertices of $U$ and $D$ as follows:

$$\max_{j} \langle h_i(t), B(t)u^j \rangle + \min_{k} \langle h_i(t), C(t)d^k \rangle \le \mu(t) \tag{1.25}$$

for $j \in \{1, \ldots, m_u\}$, $k \in \{1, \ldots, m_d\}$.

By choice of $\mu(t) = \max_j \langle h_i(t), B(t)u^j \rangle + \min_k \langle h_i(t), C(t)d^k \rangle$, the
polytopic overapproximate reachable set $V^+(t)$ for the linear dynamic game (1.22)
is

$$V^+(t) = \bigcap_{i=1}^{N} \Big\{x : v_i^+(x,t) \le \int_0^t \mu(s)\,ds + \max_{x(0) \in X_0} v_i^+(x(0),0)\Big\}. \tag{1.26}$$
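
The vertex evaluation in (1.25) reduces to a maximum over the control vertices plus a minimum over the disturbance vertices. The sketch below computes this rate for a single face normal; the helper names and the numerical data are hypothetical.

```python
def matvec(M, v):
    """Matrix-vector product for a matrix stored as a list of rows."""
    return [sum(m_k * v_k for m_k, v_k in zip(row, v)) for row in M]


def dot(a, b):
    return sum(a_k * b_k for a_k, b_k in zip(a, b))


def game_rate(h, B, C, U_vertices, D_vertices):
    """mu(t) chosen as in the text: max_j <h, B u^j> + min_k <h, C d^k>."""
    control_term = max(dot(h, matvec(B, u)) for u in U_vertices)
    disturbance_term = min(dot(h, matvec(C, d)) for d in D_vertices)
    return control_term + disturbance_term


# Hypothetical data: scalar control and disturbance, both entering the second state.
B = [[0.0], [1.0]]
C = [[0.0], [1.0]]
U_vertices = [[-1.0], [1.0]]
D_vertices = [[-0.5], [0.5]]

print(game_rate([0.0, 1.0], B, C, U_vertices, D_vertices))  # face aligned with the inputs
print(game_rate([1.0, 0.0], B, C, U_vertices, D_vertices))  # face orthogonal to the inputs
```

Integrating this rate along $h_i(s)$, as in (1.26), gives the offset of face $i$ at time $t$; compared with the disturbance-free case, the minimum over $D$ slows the growth of each face.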


1.3.4 Norm-bounded nonlinear systems 


In this section, we consider a norm-bounded nonlinear system, 


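$$\dot{x}(t) = A(t)x(t) + B(t)u(t) + \phi(x,t), \quad x(0) \in X_0,\ u(t) \in U,\ \|\phi(x,t)\| \le \beta(t) \tag{1.27}$$

where the initial set $X_0$ and the admissible control input set $U$ are assumed to be convex
polytopes which have $N$ and $N_u$ faces respectively. $\|\cdot\|$ represents the Euclidean
norm; $\beta(\cdot)$ is a positive-definite function. Then, the HJI equation in (1.3) becomes

$$D_t v(x,t) + \max_{u \in U} \{\langle D_x v(x,t),\ A(t)x(t) + B(t)u(t) + \phi(x,t) \rangle\} = 0. \tag{1.28}$$

To compute an overapproximate solution to the HJB equation in (1.28), we find the
linear functions $v_i^+(x,t)$ in (1.11) satisfying (1.13), and compute

$$\begin{aligned}
& D_t v_i^+(x,t) + \max_{u \in U} \{\langle D_x v_i^+(x,t),\ A(t)x(t) + B(t)u(t) + \phi(x,t) \rangle\} \\
&\quad = \max_{u \in U} \{\langle h_i(t), B(t)u(t) \rangle\} + \langle h_i(t), \phi(x,t) \rangle \\
&\quad \le \max_{u \in U} \{\langle h_i(t), B(t)u(t) \rangle\} + \tfrac{1}{2}\big(\|h_i(t)\|^2 + \|\phi(x,t)\|^2\big) \\
&\quad \le \max_{j} \{\langle h_i(t), B(t)u^j \rangle\} + \tfrac{1}{2}\big(\|h_i(t)\|^2 + \beta(t)^2\big) \\
&\quad \le \mu(t).
\end{aligned} \tag{1.29}$$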
If we choose $\mu(t)$ such that

$$\mu(t) = \max_{j} \langle h_i(t), B(t)u^j \rangle + \tfrac{1}{2}\big(\|h_i(t)\|^2 + \beta(t)^2\big), \tag{1.30}$$

then a polytopic overapproximate reachable set $V^+(t)$ for the norm-bounded dynamical
system (1.27) is

$$V^+(t) = \bigcap_{i=1}^{N} \Big\{x : v_i^+(x,t) \le \int_0^t \Big[\max_j \langle h_i(s), B(s)u^j \rangle + \tfrac{1}{2}\big(\|h_i(s)\|^2 + \beta(s)^2\big)\Big] ds + \max_{x(0) \in X_0} v_i^+(x(0),0)\Big\}. \tag{1.31}$$

If $\phi(x,t)$ belongs to a polytope with vertices $\{\phi^1, \ldots, \phi^{m_\phi}\}$, a polytopic overapproximate
reachable set $V^+(t)$ becomes

$$V^+(t) = \bigcap_{i=1}^{N} \Big\{x : v_i^+(x,t) \le \int_0^t \Big[\max_j \langle h_i(s), B(s)u^j \rangle + \max_k \{\langle h_i(s), \phi^k \rangle\}\Big] ds + \max_{x(0) \in X_0} v_i^+(x(0),0)\Big\}. \tag{1.32}$$
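
The two ways of bounding the nonlinear term, the norm bound used in (1.30) and the vertex maximum used in (1.32), can be compared numerically. The sketch below is illustrative only; the face normal, the bound `beta`, and the polytope vertices are hypothetical.

```python
import math


def dot(a, b):
    return sum(a_k * b_k for a_k, b_k in zip(a, b))


def rate_norm_bound(h, beta):
    """Nonlinear term of (1.30): (||h||^2 + beta^2) / 2, which bounds <h, phi>
    from above whenever ||phi|| <= beta."""
    return 0.5 * (dot(h, h) + beta ** 2)


def rate_polytope_bound(h, phi_vertices):
    """Nonlinear term of (1.32): max_k <h, phi^k> when phi lies in a known polytope."""
    return max(dot(h, phi) for phi in phi_vertices)


h = [3.0, 4.0]    # hypothetical face normal with ||h|| = 5
beta = 1.0
# Vertices of a polytope contained in the unit ball.
phi_vertices = [[1.0, 0.0], [-1.0, 0.0], [0.0, 1.0], [0.0, -1.0]]

print(rate_norm_bound(h, beta))               # 13.0
print(rate_polytope_bound(h, phi_vertices))   # 4.0
print(math.sqrt(dot(h, h)) * beta)            # 5.0, the Cauchy-Schwarz bound for comparison
```

The norm-based term is never smaller than $\|h\|\beta$, so when a polytopic description of $\phi$ is available the bound in (1.32) can be noticeably tighter.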


1.4 Examples 


We consider three examples: a linear system, a norm-bounded nonlinear system, 
and we conclude with the example which motivated this study, a nonlinear, feedback 
linearizable, dynamic game. Note that equation (1.7) provides overapproximations of 
the sets of reachable states over a range of times (the flow). In the implementation, 
we compute overapproximations of the reachable sets at specific instants of time 
without interpolation between the sets. 


1.4.1 Linear dynamical systems 


In this section, we consider a linear dynamical system $\dot{x} = Ax + Bu$, $x(0) \in X_0$,
where the control input $u(t)$ can vary inside a convex polytope $U$ and the initial
set $X_0$ is also a convex polytope. The system parameters ($A$, $B$, $X_0$, and $U$) given
in [11] are used. Fig. 1.1 shows the evolution of the projection on $x_3$ and $x_4$ over
time. This result is similar to that in [11], yet computation time with the method 
shown in Section 1.3.1 is 1.17 seconds (which includes plotting the result shown 
in Fig. 1.1) using MATLAB on a 700MHz Pentium III PC. For comparison, the 
algorithm proposed in [11] takes 18 seconds using the same parameters. 
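
As an illustration of why matrix exponentials suffice for a linear system, the following minimal sketch bounds the reach set by one half-space per direction using the standard support-function formula $\rho(l \mid R(t)) = \rho(e^{A^T t} l \mid X_0) + \int_0^t \rho(B^T e^{A^T (t-s)} l \mid U)\,ds$. It is not the implementation used for the timings above: the double-integrator matrices, the box-shaped sets, and all function names are assumptions chosen only for this example, and boxes stand in for general polytopes.

```python
def mat_mul(A, B):
    """Product of two matrices given as lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]


def mat_vec(A, v):
    """Product of a matrix and a vector."""
    return [sum(a * x for a, x in zip(row, v)) for row in A]


def transpose(A):
    return [list(col) for col in zip(*A)]


def expm(A, t, terms=40):
    """Truncated Taylor series for exp(A t); adequate for small ||A t||."""
    n = len(A)
    result = [[float(i == j) for j in range(n)] for i in range(n)]
    term = [row[:] for row in result]
    for k in range(1, terms):
        term = [[x * t / k for x in row] for row in mat_mul(term, A)]
        result = [[r + x for r, x in zip(rr, tr)] for rr, tr in zip(result, term)]
    return result


def support_box(l, lo, hi):
    """Support function of the box [lo, hi] in direction l."""
    return sum(max(li * a, li * b) for li, a, b in zip(l, lo, hi))


def reach_support(A, B, l, t, x_box, u_box, steps=200):
    """Support function of the reach set at time t in direction l."""
    At, Bt = transpose(A), transpose(B)
    value = support_box(mat_vec(expm(At, t), l), *x_box)
    ds = t / steps
    for k in range(steps):  # midpoint rule for the integral over [0, t]
        s = (k + 0.5) * ds
        direction = mat_vec(Bt, mat_vec(expm(At, t - s), l))
        value += support_box(direction, *u_box) * ds
    return value


# Hypothetical double integrator: x1' = x2, x2' = u.
A = [[0.0, 1.0], [0.0, 0.0]]
B = [[0.0], [1.0]]
x_box = ([-1.0, -1.0], [1.0, 1.0])  # X_0 = [-1, 1] x [-1, 1]
u_box = ([-1.0], [1.0])             # U = [-1, 1]
bound = reach_support(A, B, [1.0, 0.0], 1.0, x_box, u_box)
```

Each direction $l$ yields one face $\langle l, x \rangle \le$ `bound` of an overapproximating polytope; repeating the call for several directions and intersecting the half-spaces gives the polytope.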


1.4.2 Norm-bounded nonlinear systems 
We consider a norm-bounded nonlinear system
\[ \dot{x} = A(t)x + B(t)u(t) + \phi(x,t), \quad x(0) \in X_0, \quad u(t) \in U \tag{1.33} \]


where the initial set $X_0$ and the control input set $U$ are convex polytopes. The nonlinear
function $\phi(x,t)$ is assumed to be norm-bounded, i.e., $\|\phi(x,t)\| \le \beta(t)$ where
$\beta(t) > 0$. The system parameters are defined as follows:

[Figure 1.1: plot titled "4D example: projected onto x3 and x4"; the vertical axis is time.]

Fig. 1.1. The forward reachable set of a four-dimensional linear dynamical system (projection
onto x3 and x4).


-0.5 4.0 -1 
sal a Bal 
$X_0 = [4,5] \times [4,5]$, $U = [-0.1, 0.1]$.


The evolution of the forward reachable set over time is shown in Fig. 1.2 and its 
computation time is 0.87 seconds (including plotting the result) using MATLAB on 
the same PC. 


1.4.3 Conflict resolution between two aircraft 


Last, we consider the two aircraft collision avoidance problem, as an example of 
feedback linearizable nonlinear systems and linear dynamic games. This is the same 
problem (the motivation for this research) described in Section 1.2. Fig. 1.3 shows 
the relative configuration between two aircraft showing the protected zone. 

Aircraft 1 tries to avoid a conflict with aircraft 2 within the limits of its capability.
Thus, we want to compute a backward reachable set (unsafe set) from the target set
(protected zone). The unsafe set represents the states from which the two aircraft
would eventually have a conflict no matter how aircraft 1 tries to avoid it [5].

[Figure 1.2: plot titled "Reach set for norm-bounded nonlinear system"; the vertical axis is time.]

Fig. 1.2. The forward reachable set of a norm-bounded nonlinear system.

[Figure 1.3: diagram of the two aircraft, the protected zone, and its polytopic approximation.]

Fig. 1.3. Relative configuration of two aircraft showing the protected zone.

Using dynamic extension [27] with $\sigma_i$ as a new state variable (compared to (1.5)),
we obtain a new nonlinear model which is feedback linearizable [28],

\[ \dot{\xi}_i = \begin{bmatrix} \dot{x}_i \\ \dot{y}_i \\ \dot{\psi}_i \\ \dot{\sigma}_i \end{bmatrix} = \begin{bmatrix} \sigma_i \cos\psi_i \\ \sigma_i \sin\psi_i \\ \omega_i \\ a_i \end{bmatrix}, \quad (i \in \{1,2\}), \tag{1.34} \]

where $a_i$ is the acceleration of aircraft $i$ and is a new control input. Thus, the new
state and input variables are $\xi_i := [x_i\ y_i\ \psi_i\ \sigma_i]^T$ and $\eta_i := [a_i\ \omega_i]^T$ respectively. We
introduce a change in state variables, $z_i = T(\xi_i)$, and a change of the input variables,
$\eta_i = M(\xi_i)u_i$, as in [28]. We note that $T$ and $M$ are diffeomorphisms everywhere
except at $\sigma_i = 0$. Then, the feedback linearized model of the nonlinear kinematic
aircraft model in (1.34) obtained through the transformations $T$ and $M$ is [28]:


\[ \dot{z}_i = \frac{\partial T}{\partial \xi_i}\dot{\xi}_i \;\Rightarrow\; \dot{z}_i = A z_i + B u_i \tag{1.35} \]

with $A$ and $B$ as defined in [28].


[Figure 1.4: three-dimensional plot; view azimuth 80°, elevation 10°.]

Fig. 1.4. Comparison between overapproximate (grid) and exact (solid) backward reachable
sets (unsafe sets) of conflict resolution between two aircraft.


The relative kinematic aircraft model between two aircraft can be obtained by
introducing new states $\xi_r := \xi_2 - \xi_1$ in the original nonlinear state space and
$z_r := z_2 - z_1$ in the linearized state space. Thus, a linearized relative kinematic aircraft
model is

\[ \dot{z}_r = A z_r + B u_2 - B u_1, \quad u_2 \in U, \quad u_1 \in D, \tag{1.36} \]


where the admissible control input set $U$ and the disturbance input set $D$ are polytopes.
This is a linear dynamic game since aircraft 1 ($u_1$) tries to keep aircraft 2 from
entering into its protected zone (target set) to prevent a conflict, but aircraft 2 ($u_2$)
tries to enter the protected zone of aircraft 1. A target set (protected zone) is assumed
to be $Y_0 = [-5,5] \times [-5,5] \times [-\pi, \pi]$. Using dynamic extension, we have performed
the computation in four dimensions (1.36) and projected the result onto the
relative coordinate in three-dimensional space. A polytopic overapproximate backward
reachable set is first computed in the linearized space, and then the overapproximate
backward reachable set in the original state space is obtained through the
transformations $T$ and $M$. The overapproximate backward reachable set for conflict
resolution with heading changes only, using the target set $Y_0$, normalized aircraft
speeds $\sigma_1 = \sigma_2 = 5$, and angular velocities $|\omega_1| \le 1$ and $|\omega_2| \le 1$, is compared with the
exact solution in [4] in Fig. 1.4.


[Figure 1.5: diagram of the two aircraft and the unsafe zone.]

Fig. 1.5. Conflict scenario: Aircraft 2 reaches the boundary of the unsafe zone of aircraft 1
with a given initial relative angle $\psi_r$.


The backward reachable set obtained by using the polytopic approximation is
an overapproximation of the exact reachable set, and its computation time is about 1.0
seconds (including plotting the result as shown in Fig. 1.4) using MATLAB on the
same PC, whereas the numerical solution to the exact PDE [5] takes approximately 4
minutes on a Sun UltraSparc I with 50 grid nodes in each dimension. Fig. 1.5 shows
a conflict scenario in which aircraft 2 tries to enter the unsafe zone. When aircraft 2
reaches the boundary of the unsafe zone, the optimal control input for aircraft 1 can
be easily obtained as follows:


\[ u_1^*(t) = \arg\max_{u_1 \in D} \langle D_x v(x,t),\, -B(t)u_1(t) \rangle = \arg\max_j \langle e^{-A^T t} h_i(0),\, -B u_1^j \rangle. \tag{1.37} \]
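
Because the expression being maximized is linear in the control and the admissible set is a polytope, the maximum is attained at a vertex, so the optimal input can be found by enumerating vertices. The sketch below illustrates this under assumed data: the normal vector `h`, the matrix `B`, the square input set, and the function name `best_vertex` are hypothetical and are not taken from the aircraft example.

```python
def best_vertex(h, B, vertices):
    """Return the vertex u of a polytope that maximizes <h, -B u>."""
    def score(u):
        Bu = [sum(b * x for b, x in zip(row, u)) for row in B]
        return -sum(hi * bi for hi, bi in zip(h, Bu))
    return max(vertices, key=score)


# Hypothetical data: face normal h, identity input matrix, square input set.
h = [1.0, 2.0]
B = [[1.0, 0.0], [0.0, 1.0]]
square = [(-1.0, -1.0), (-1.0, 1.0), (1.0, -1.0), (1.0, 1.0)]
u_star = best_vertex(h, B, square)
```

The cost of the search grows only with the number of vertices of the input polytope, which is why the control is cheap to evaluate online once the face normal is known.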


Fig. 1.6 shows a simulation for conflict resolution between the two aircraft with
the initial condition ($x_r = 10$, $y_r = -20$, $\psi_r = 115^\circ$). Since both aircraft behave
optimally, the relative position of aircraft 2 moves along the boundary of the unsafe
set. As expected, chattering occurs along the boundary. To avoid such a phenomenon,
one could introduce a buffer zone around the boundary so that the control inputs
change smoothly as aircraft 2 approaches the boundary.

[Figure 1.6: plot titled "conflict resolution with the optimal strategies"; the horizontal axis is $x_r$.]

Fig. 1.6. Conflict resolution simulation with relative initial states ($x_r = 10$, $y_r = -20$,
$\psi_r = 115^\circ$). Aircraft 1 tries to avoid a conflict with aircraft 2 with the optimal strategy.

Using similar analysis to the above, we may obtain the underapproximate back- 
ward reachable set. This is obtained for the collision avoidance example, using the 
same parameters, and compared in Fig. 1.7 with the overapproximate set. 


1.5 Conclusions 


The polytopic approximation gives an overapproximation of the exact reachable
set and is computationally efficient: it requires computing matrix exponentials instead
of a Hamilton—Jacobi partial differential equation. The data structure of the poly- 
topic approximation method becomes more complicated than that of the ellipsoidal 
approximation method [17] as the number of faces of the polytope increases, yet 
the computation of the matrix exponential is easier than solving the (usually Riccati 
type) differential equation required for the ellipsoidal methods. The optimal control 
input can be easily computed from the Hamiltonian since the Hamiltonian is linear 
with respect to the control, and the control input set is a convex polytope. The poly- 
topic approximation method can be applied to high dimensional systems which may 
not be solved exactly without substantially increasing the computational time. This 
may be done by decomposing the computation of an approximation (over or under)
of the reachable set into a number of computations of approximations of subsystem
reachable sets [29].

[Figure 1.7: plot titled "Comparison between over and underapproximations".]

Fig. 1.7. Comparison between the under and overapproximate backward reachable sets for
conflict resolution between two aircraft.
Summary. This chapter deals with the problem of measurement feedback control under set-
membership uncertainty for systems with original linear structure and hard bounds on the
uncertain items. It indicates feedback control strategies which ensure guaranteed deviation 
from a given terminal set despite the uncertain disturbances and incomplete feedback. Routes 
for numerical treatment of the solutions are suggested on the basis of ellipsoidal techniques. 


2.1 Introduction 


The problem of measurement feedback control under uncertain disturbances 
(noise) is one of the central topics in the theory of control synthesis. It is well mo- 
tivated by applied issues and has been well developed in a stochastic setting as a 
combination of stochastic filtering theory with the theory of stochastic control in it- 
self. However a considerable number of problems in control design have to deal with 
systems subjected to information conditions which are other than stochastic. 

Here the uncertain items are treated as unknown but bounded, with preassigned 
bounds. Such problems have to rely on the theory of guaranteed state estimation, 
where the estimates of the system dynamics are set-valued, with further procedures 
of controlling the evolution of set-valued systems. The related approaches naturally 
require new types of techniques and totally new formalization of the overall mea- 
surement feedback control problems (see [1]-[6]). 

These approaches heavily rely on nonlinear analysis, set-valued calculus as well 
as on minmax theory and differential games (see [7]-[12]), but are not simple and 
lead to rather cumbersome technical and numerical procedures. Such perhaps are the 
reasons why there are relatively few papers on set-valued, minmax or game-theoretic 
approaches to measurement feedback control, though the problems are pending, are 
seriously motivated and have to be solved. 

This chapter gives a solution to a problem of measurement feedback control un- 
der set-membership uncertainty with hard bounds on the uncertain items. Here, the 
problem is solved using procedures based on a specific version of the ellipsoidal 
technique as developed jointly with P. Varaiya [13]-[15]. The suggested schemes
ensure exact solution representations with recursive computation schemes. 


2.2 The Basic Problem. The Cost Functional 


Let us first discuss the setting of the problem and the general approach to its 
solution. Given is the system 
dx/dt = A(t)x + B(t)u+ C(t) f(t), (2.1) 
with continuous matrix coefficients A(t), B(t), C(t) and hard bounds on the control 
u and unknown disturbance f(t): 
\[ u \in \mathcal{P}(t), \quad f(t) \in \mathcal{Q}(t). \tag{2.2} \]
The system evolution is considered within time interval $t \in [t_0, t_1]$. Here
$\mathcal{P}(t)$, $\mathcal{Q}(t)$ are set-valued functions with values in the variety of convex compact sets
in $R^p$, $R^q$, continuous in the Hausdorff metric. The on-line information on vector $x$
arrives from observations due to the measurement equation
\[ y(t) = H(t)x + \xi(t), \tag{2.3} \]
with $y(t) \in R^m$ being the available measurement and $\xi(t)$ the unknown but bounded
continuous disturbance (measurement noise):

\[ \xi(t) \in \mathcal{R}(t), \quad t \in [t_0, t_1]. \tag{2.4} \]

The set-valued function $\mathcal{R}(t)$ is similar to $\mathcal{P}(t)$, and $H(t)$ is continuous. The initial
condition is given by the inclusion

\[ x(t_0) \in \mathcal{X}^0, \tag{2.5} \]
where $\mathcal{X}^0$ is a given convex compact in $R^n$.

The pair $\{t_0, \mathcal{X}^0\}$ is said to be the starting position of the system. Given starting
position $\{t_0, \mathcal{X}^0\}$, functions $A(t)$, $B(t)$, $C(t)$, $H(t)$, realization $u[s]$, $s \in [t_0, t)$, of
the implemented control as well as the set-valued functions $\mathcal{P}(t)$, $\mathcal{Q}(t)$, $\mathcal{R}(t)$ and the
measured values

\[ y_t(\sigma) = y(t + \sigma), \quad \sigma \in [-(t - t_0), 0], \]

one may solve the problem of “guaranteed estimation” [18], [17], [5], [19]. This
problem consists in specifying the “information set” $\mathcal{X}(t, y_t(\cdot)) = \mathcal{X}(t, \cdot) = \mathcal{X}[t]$
of system (2.1)-(2.4), which consists of the ends $x(t)$ of all the trajectories of system
(2.1), consistent with equation (2.3), and constraints (2.2), (2.4), (2.5), under given
realizations $u(s)$, $y(s)$, $s \in [t_0, t]$.

The on-line position (state) of the overall system (2.1)-(2.4) may now be taken
as $\{t, \mathcal{X}[t]\}$. In a loose setting, Problem I consists in specifying a feedback strategy
$U(t, \mathcal{X}[t])$ which would steer the overall system from any starting position
$\{\tau, \mathcal{X}[\tau]\}$, $\tau \in [t_0, t_1]$, to a preassigned $\mu$-neighborhood $\mathcal{M}_\mu$ of a given target set
$\mathcal{M}$ at given time $t_1$, despite the unknown disturbances $f$ and the incomplete measurements.
At the same time, the class $\mathcal{U} = \{U(t, \mathcal{X}[t])\}$ of such strategies should
ensure the existence and prolongability of solutions to the differential inclusion
\[ \dot{x} \in A(t)x + B(t)U(t, \mathcal{X}[t]) + C(t)f(t), \tag{2.6} \]

within the interval $t \in [t_0, t_1]$.

We will further indicate a rigorous formulation of the problem. However already 
at the preliminary stage one may observe that Problem I may be separated into two, 
namely into Problem GE of guaranteed state estimation and Problem CS of control 
synthesis in a generalized state space. 

The overall Problem I may be posed with the aid of the next functional: 


V(to, X°) = min maz maei = hy lela), X”) 


7 J hala (t) — H(E)e(t), R())dt 


to 


+ hy (a(ti),M) |U EU; f(t) € Q0), t € [to,7]}- D 
Here, $y^*(t)$ is the available measurement,
\[ h_+(Q'', Q') = \min\{\varepsilon : Q' + \varepsilon B(0) \supseteq Q''\} \]

where $Q'$, $Q''$ are compacts in $R^n$.
The value $h_+(Q'', Q')$ is the Hausdorff semidistance, while

\[ h(Q'', Q') = \max\{h_+(Q'', Q'), h_+(Q', Q'')\} \]
is the Hausdorff distance. If $Q'' = q \in R^n$ is an isolated point, then
\[ h_+(Q'', Q') = d(q, Q') = \min\{\langle q - q', q - q' \rangle^{1/2} \mid q' \in Q'\} \]
is the Euclidean distance from point $q$ to set $Q'$.
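
For intuition, the semidistance and the Hausdorff distance can be evaluated directly when both sets are finite collections of points: the smallest $\varepsilon$ such that the $\varepsilon$-neighborhood of $Q'$ covers $Q''$ is then the largest distance from a point of $Q''$ to its nearest point of $Q'$. The sketch below assumes finite point sets (not general compact sets), and the function names are illustrative only.

```python
import math


def semidistance(q2, q1):
    """h_+(Q'', Q') for finite point sets: the farthest point of Q'' from Q'."""
    return max(min(math.dist(a, b) for b in q1) for a in q2)


def hausdorff(q2, q1):
    """Hausdorff distance: the larger of the two semidistances."""
    return max(semidistance(q2, q1), semidistance(q1, q2))


segment_ends = [(0.0, 0.0), (1.0, 0.0)]
origin = [(0.0, 0.0)]
covered = semidistance(origin, segment_ends)    # origin lies in the other set
uncovered = semidistance(segment_ends, origin)  # (1, 0) is at distance 1
both = hausdorff(origin, segment_ends)
```

The example shows that the semidistance is not symmetric: it vanishes whenever the first set is contained in the second, which is exactly the property used to express inclusions such as "the state set lies within the target set".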


Remark 1. (i) In the previous relation the minimum should be taken over set-valued
strategies $U \in \mathcal{U}$, where $U = U(t, \mathcal{X}[t]) \subseteq \mathcal{P}(t)$ is Hausdorff-continuous in $t$
and upper semicontinuous in $\mathcal{X}[t]$, and where the latter is the information set which
defines the on-line position $\{t, \mathcal{X}[t]\}$ of the overall system.

(ii) The existence of a solution to equation (2.7) under a given class of strategies
$U = U(t, \mathcal{X}[t])$ was indicated in [5].

(iii) The maximum over $x(\cdot)$ is to be taken over all motions caused by the multivalued
nature of the related differential inclusion (2.6).


In order to simplify the calculations we shall treat the given problem in another
coordinate system. Namely, denote $G(t, \tau)$ to be the fundamental transition matrix
of the homogeneous system (2.1),

\[ \frac{\partial G(t, \tau)}{\partial t} = A(t)G(t, \tau), \quad G(\tau, \tau) = I. \]

Introducing the transformation $x = G(t, \vartheta)\bar{x}$ to a new variable $\bar{x}$, making the
changes and returning after that to the original notations, without loss of generality
we may transform the original system (2.1), (2.2) to
\[ \dot{x} = B(t)u + C(t)f(t), \tag{2.8} \]
\[ y(t) = H(t)x + \xi(t), \tag{2.9} \]


under constraints (2.2), (2.4), (2.5). 
Before passing to the solution of the formulated problem we shall present system
(2.8), (2.9) in the form of the next array of systems:

\[ dx^*/dt = B(t)u, \quad x^*(t_0) = 0, \tag{2.10} \]
and
\[ dw/dt = C(t)f(t), \quad w(t_0) \in \mathcal{X}^0, \tag{2.11} \]
with
\[ z(t) = H(t)w + \xi(t), \tag{2.12} \]
where

\[ x^* + w = x, \quad z(t) = y(t) - H(t)\int_{t_0}^{t} B(s)u^*(s)\,ds. \]

With realization $u = u^*(s)$, $s \in [t_0, t)$ given, there will be a one-to-one mapping
between realizations $y^*(s)$ and $z^*(s)$. Similarly to $\mathcal{X}[t]$, one may now define the
information set of system (2.11), (2.12), denoting it as $\mathcal{W}(t, z(\cdot)) = \mathcal{W}(t, \cdot) =
\mathcal{W}[t]$. Then we have $\mathcal{X}[t] = x^*(t) + \mathcal{W}[t]$.

The given representations allow us to solve Problem E only for system (2.11), 
(2.12), separating this solution from the problem of specifying the control itself. 

We further proceed with solving Problem GE. 


2.3 The Problem of Guaranteed (Minmax) Estimation 


Problem GE for system (2.11), (2.12) may be formulated in two versions: $E$
and $E_V$.


Problem $E$: Given are equations (2.11), (2.12) and starting position $\{t_0, \mathcal{X}^0\}$
under constraints (2.2), (2.4), (2.5), as well as the available measurements, the realization
$z = z^*(t)$, $t \in [t_0, \tau]$. Specify the information set $\mathcal{W}[\tau]$ of solutions $w(\tau)$
to system (2.11), consistent with $z^*(t)$, $t \in [t_0, \tau]$, and constraints (2.2), (2.4), (2.5).

The information set $\mathcal{W}[\tau]$ is the guaranteed estimate of the realized vector $w(\tau)$.
The specification of this set is the subject of the theory of guaranteed (minmax)
estimation [16], [5], [18], [19], [3]. However, for the problems of this chapter it is
necessary not only to calculate set $\mathcal{W}[\tau]$, but to arrange the calculations on-line,
following the evolution of $\mathcal{W}[\tau]$ in time. Such procedures may be organized through
the solution of the following problem of dynamic optimization.


Problem $E_V$: Given are the starting position $\{t_0, \mathcal{X}^0\}$ and the realization $z^*(\cdot)$.
Find value function

\[ V(\tau, w) = \min\{d(w(t_0), \mathcal{X}^0) \mid f(t) \in \mathcal{Q}(t),\ t \in [t_0, \tau]\} \]

due to equation (2.11), under additional conditions
\[ w(\tau) = w; \quad z^*(s) - H(s)w(s) \in \mathcal{R}(s), \quad s \in [t_0, \tau]. \tag{2.13} \]
The second of conditions (2.13) is actually an on-line state constraint.

Lemma 1. The following relation is true:
\[ \mathcal{W}[\tau] = \{w : V(\tau, w) \le 0\}. \]

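We shall now calculate function $V(\tau, w)$ by using the techniques of convex analysis.
In order to do that we fix the given measurement realization $z^*(t)$, $t \in [t_0, \tau]$,
and consider the relations which are true for $d(w(t_0), \mathcal{X}^0) > 0$:

\[ d(w(t_0), \mathcal{X}^0) = \max\{\langle l, w(t_0) \rangle - \rho(l \mid \mathcal{X}^0) \mid \langle l, l \rangle \le 1\} \tag{2.14} \]

and

\[ \int_{t_0}^{\tau} d(z^*(t) - H(t)w(t), \mathcal{R}(t))\,d\alpha(t) = \max\Big\{ \int_{t_0}^{\tau} \big( \langle \lambda(t), z^*(t) - H(t)w(t) \rangle - \rho(\lambda(t) \mid \mathcal{R}(t)) \big)\,d\alpha(t) \;\Big|\; \lambda(\cdot) \in K[t_0, \tau] \Big\} \]

for any $\lambda(\cdot) \in K[t_0, \tau]$, $\alpha(\cdot) \in \mathrm{Var}_+[t_0, \tau]$. Here $K$ is a compact set in the space
$C_r[t_0, \tau]$ of $r$-dimensional continuous functions, and $\mathrm{Var}_+[t_0, \tau]$ is the space of
nondecreasing functions of unit variation [21]. The symbol $\rho(l \mid \mathcal{W}) =
\max\{\langle l, w \rangle \mid w \in \mathcal{W}\}$ denotes the value of the support function of compact $\mathcal{W}$ along
direction $l$.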
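
The duality between the distance to a convex set and its support function, $d(w, \mathcal{X}) = \max\{\langle l, w \rangle - \rho(l \mid \mathcal{X}) \mid \langle l, l \rangle \le 1\}$, can be checked numerically in the plane. The sketch below assumes a disk as the convex set and samples unit directions; the function names, the sampling density, and the test point are illustrative choices, not part of the chapter.

```python
import math


def support_disk(l, center, radius):
    """Support function of a disk: <l, c> + r * |l|."""
    return l[0] * center[0] + l[1] * center[1] + radius * math.hypot(l[0], l[1])


def distance_via_support(w, support, samples=3600):
    """Approximate max{<l, w> - rho(l) : |l| <= 1} by sampling unit directions.

    The objective is positively homogeneous in l, so the maximum is attained
    either at l = 0 (value 0, point inside the set) or on the unit circle.
    """
    best = 0.0
    for k in range(samples):
        angle = 2.0 * math.pi * k / samples
        l = (math.cos(angle), math.sin(angle))
        best = max(best, l[0] * w[0] + l[1] * w[1] - support(l))
    return best


unit_disk = lambda l: support_disk(l, (0.0, 0.0), 1.0)
outside = distance_via_support((3.0, 0.0), unit_disk)  # true distance is 2
inside = distance_via_support((0.5, 0.0), unit_disk)   # true distance is 0
```

The maximizing direction is the unit vector pointing from the set toward the point, which is the role played by the vector $l$ in the derivations that follow.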

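We have

\[ V(\tau, w) = \max\Big\{ \langle s(\tau), w \rangle - \int_{t_0}^{\tau} \rho(s(t) \mid C(t)\mathcal{Q}(t))\,dt + \int_{t_0}^{\tau} \big( \langle \lambda(t), z^*(t) \rangle - \rho(\lambda(t) \mid \mathcal{R}(t)) \big)\,d\alpha(t) - \rho(s(t_0) \mid \mathcal{X}^0) \Big\} \tag{2.15} \]
where $s(t)$ is the solution to the adjoint equation of Problem $E_V$, the equation
\[ ds = -sA(t)\,dt - \lambda'(t)H(t)\,d\alpha(t), \quad s(\tau) = l. \tag{2.16} \]

Note that here and in the sequel function $z^*(t)$ is considered continuous, while function
$\alpha(t)$ is right-continuous. The maximums in problem (2.15) are attained and are
unique.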

Given $V(\tau, w)$, one may now calculate the support function

\[ \rho(l \mid \mathcal{W}[\tau]) = \max\{\langle l, w \rangle \mid V(\tau, w) \le 0\}. \]

(Following [5], one may also calculate the same item directly.) We have
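\[ \rho(l \mid \mathcal{W}[\tau]) = \inf\Big\{ \rho(s(t_0) \mid \mathcal{X}^0) + \int_{t_0}^{\tau} \rho(s(t) \mid C(t)\mathcal{Q}(t))\,dt + \int_{t_0}^{\tau} \rho(\lambda(s) \mid \mathcal{Z}(s))\,d\alpha(s) \;\Big|\; \lambda(\cdot) \in K[t_0, \tau],\ \alpha(\cdot) \in \mathrm{Var}_+[t_0, \tau] \Big\}, \tag{2.17} \]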


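where $\mathcal{Z}(t) = z(t) - \mathcal{R}(t)$, and $s(t)$ is the solution to the adjoint equation (2.16).
The “motion” of set $\mathcal{W}[\tau]$ may be described by an evolution equation of the
“funnel” type

\[ \lim_{\sigma \to 0} h_+\big( \mathcal{W}[\tau + \sigma],\ \mathcal{W}[\tau] \cap \mathcal{Z}(\tau) + \sigma C(\tau)\mathcal{Q}(\tau) \big) = 0 \tag{2.18} \]

with $\mathcal{W}[t_0] = \mathcal{X}^0$. Set $\mathcal{W}[\tau]$ will be the maximal solution with respect to inclusion
of equation (2.18) (see [22], [23]).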
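
A discrete-time, scalar analog may help to picture how an information set evolves: at each step the current set is enlarged by the bounded disturbance and then intersected with the set of states consistent with the new measurement. The sketch below is only such an analog, not the funnel equation itself; it assumes intervals for all sets, a unit measurement map, a fixed time step, and hypothetical function and parameter names.

```python
def information_interval(w0, q, r, measurements, dt):
    """Propagate an interval estimate of w for dw/dt = f, z = w + xi.

    w0 = (lo, hi) is the initial interval, q = (q_lo, q_hi) bounds f,
    r bounds |xi|, and measurements is the list of observed values z.
    """
    lo, hi = w0
    for z in measurements:
        # Prediction: add the set of possible increments dt * Q.
        lo, hi = lo + dt * q[0], hi + dt * q[1]
        # Correction: keep only the states consistent with |z - w| <= r.
        lo, hi = max(lo, z - r), min(hi, z + r)
        if lo > hi:
            raise ValueError("measurements inconsistent with the assumed bounds")
    return lo, hi


lo, hi = information_interval((-1.0, 1.0), (-1.0, 1.0), 0.5, [0.2, 0.3], 0.1)
```

An empty intersection signals that the recorded measurements cannot have been produced under the assumed bounds, which is a useful diagnostic in the unknown-but-bounded setting.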
We may now pass to the problem of control in the space of “trajectories” W[r]. 


2.4 The Synthesizing Control 


Consider the on-line position $\{\tau, x^*, \mathcal{W}\}$ of the overall system (2.10)-(2.12).
Problem CS: For the evolution system (2.10), (2.18), specify a set-valued control
strategy $U^0(t, x^*, \mathcal{W}) \subseteq \mathcal{P}(t)$, which would ensure the inclusion

\[ x^*(\vartheta) + \mathcal{W}[\vartheta] \subseteq \mathcal{M} + \mu^0 E(0, I), \]

for some $\mu^0 > 0$, whatever be the disturbances $f(t)$, $\xi(t)$ and the unknown starting
point $w(t_0) = x(t_0)$ subjected to constraints (2.2), (2.4), (2.5) for $t \in [t_0, \vartheta]$. Here
$E(0, I) = \{x : \langle x, x \rangle \le 1\}$.

In order to solve the problem of control synthesis, consider first the following 
auxiliary problem. 

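Let $\mathcal{W}(\vartheta, z^*(\cdot); \tau, \mathcal{W})$ be the reach set of system (2.11), (2.12) from position
$\{\tau, \mathcal{W}\}$ over inputs $f(t) \in \mathcal{Q}(t)$, under state constraint

\[ H(t)w(t) \in z^*(t) - \mathcal{R}(t), \quad t \in [\tau, \vartheta], \]

where function $z^*(t)$ is generated due to a specific triplet $\zeta^* = \{w^*, f^*(\cdot), \xi^*(\cdot)\}$
subjected to constraints

\[ w^* \in \mathcal{W}, \quad f^*(t) \in \mathcal{Q}(t), \quad \xi^*(t) \in \mathcal{R}(t), \quad t \in [\tau, \vartheta]. \tag{2.19} \]
Problem CS-V: Find the value function $V(\vartheta, \tau, x^*, \mathcal{W})$, according to the formula
\[ V(\vartheta, \tau, x^*, \mathcal{W}) = \min_{u} \max_{\zeta}\, d(x^*(\vartheta), \mathcal{W}(\vartheta, z(\cdot); \tau, \mathcal{W})), \]
with fixed $\{x^*(\tau) = x^*,\ \mathcal{W}(\tau) = \mathcal{W}\}$.

Here the minimum is to be taken over all functions $u(s) \in \mathcal{P}(s)$, $s \in [\tau, \vartheta]$, and
the maximum over all triplets $\zeta = \{w, f(\cdot), \xi(\cdot)\}$, where $\zeta \in Z_\tau(\mathcal{W})$ and
\[ Z_\tau(\mathcal{W}) = \{\{w, f(\cdot), \xi(\cdot)\} : w \in \mathcal{W},\ f(t) \in \mathcal{Q}(t),\ \xi(t) \in \mathcal{R}(t),\ t \in [\tau, \vartheta]\}. \]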


Remark 2. In the monograph [5] it is demonstrated that the maximum over triplets
$Z_\tau(\mathcal{W})$ in Problem CS-V is equivalent to the maximum over all pairs $\{w, z(\cdot)\}$, where
$w \in \mathcal{W}$ and

\[ z(t) \in \mathcal{Z}(\tau, \mathcal{W}) = \{H(t)w(t) + \xi(t) \mid \zeta \in Z_\tau(\mathcal{W}),\ t \in [\tau, \vartheta]\}, \]
due to equality

\[ w(t) = w + \int_{\tau}^{t} C(s)f(s)\,ds. \]


The solution to this problem follows the techniques of convex analysis as applied 
along the lines of monograph [5], sections 17,18, and also [13], [24]. We have the 
next proposition. 


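Theorem 1. The following relation is true:
\[ V(\vartheta, \tau, x^*, \mathcal{W}) = \max\{\Psi(l, \tau, \vartheta, x^*, \mathcal{W}) \mid \langle l, l \rangle \le 1\}, \]

where

\[ \Psi(l, \tau, \vartheta, x^*, \mathcal{W}) = \langle l, x^*[\tau] \rangle + \rho(l \mid \mathcal{W}[\tau]) + \int_{\tau}^{\vartheta} \rho(l \mid C(t)\mathcal{Q}(t))\,dt - \int_{\tau}^{\vartheta} \rho(-l \mid B(t)\mathcal{P}(t))\,dt - \rho(l \mid \mathcal{M}). \tag{2.20} \]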


We will be further interested in the variety æ |t, W] of all convex compact sets of 
type Y[7T] = {a* + W} which satisfy for a fixed W the inequality 


V(r, a*,W) <0, Va* € fr]. (2.21) 


Following conventional reasoning (see [5], [24], [13]) we come to the next conclu- 
sion. 


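Corollary 1. Among the variety of sets $\mathcal{X}[\tau] \in \mathbf{X}[\tau, \mathcal{W}]$ there exists a set $\mathcal{X}^*[\tau]$
which is maximal with respect to inclusion, namely

\[ \mathcal{X}[\tau] \subseteq \mathcal{X}^*[\tau], \quad \forall \mathcal{X}[\tau] \in \mathbf{X}[\tau, \mathcal{W}]. \]
The set $\mathcal{X}^*[\tau]$ may be represented through a multivalued integral as
\[ \mathcal{X}^*[\tau] = \Big( G(\tau, \vartheta)\mathcal{M} - \int_{\tau}^{\vartheta} G(\tau, s)B(s)\mathcal{P}(s)\,ds \Big) \mathbin{\dot{-}} \int_{\tau}^{\vartheta} G(\tau, s)C(s)\mathcal{Q}(s)\,ds \tag{2.22} \]
where
\[ \mathcal{X}' \mathbin{\dot{-}} \mathcal{X}'' = \{x : x + \mathcal{X}'' \subseteq \mathcal{X}'\} \]

stands for the geometric (Minkowski) difference of convex sets $\mathcal{X}'$, $\mathcal{X}''$.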
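
The geometric difference $\{x : x + \mathcal{X}'' \subseteq \mathcal{X}'\}$ has a closed form when both sets are axis-aligned boxes, which makes the definition easy to experiment with. The sketch below assumes boxes given as lists of per-coordinate intervals; general convex sets, as used in the chapter, would need support functions instead, and the function name is illustrative.

```python
def geometric_difference(box1, box2):
    """Geometric (Minkowski) difference of two boxes, or None if it is empty.

    Each box is a list of (lower, upper) pairs, one per coordinate. A shift x
    keeps x + [c, d] inside [a, b] exactly when a - c <= x <= b - d.
    """
    result = []
    for (a, b), (c, d) in zip(box1, box2):
        lo, hi = a - c, b - d
        if lo > hi:
            return None  # box2 is wider than box1 along this coordinate
        result.append((lo, hi))
    return result


target = [(-3.0, 3.0), (-2.0, 2.0)]
disturbance = [(-1.0, 1.0), (-0.5, 0.5)]
shrunk = geometric_difference(target, disturbance)
too_big = geometric_difference([(-1.0, 1.0)], [(-2.0, 2.0)])
```

Subtracting the disturbance set in this sense leaves exactly those states from which no admissible disturbance can push the system out of the target, which is why the operation appears in the backward construction of the solvability set.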


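Assumption 1. The function
\[ k(l, \tau) = \rho(l \mid G(\tau, \vartheta)\mathcal{M}) + \int_{\tau}^{\vartheta} \rho(-l \mid G(\tau, s)B(s)\mathcal{P}(s))\,ds - \int_{\tau}^{\vartheta} \rho(l \mid G(\tau, s)C(s)\mathcal{Q}(s))\,ds \]
is proper, convex in $l$ for all $\tau \in [t_0, \vartheta]$.
We take this assumption to be true. Then
\[ \rho(l \mid \mathcal{X}^*[\tau]) = \rho(l \mid G(\tau, \vartheta)\mathcal{M}) + \int_{\tau}^{\vartheta} \big( \rho(-l \mid G(\tau, s)B(s)\mathcal{P}(s)) - \rho(l \mid G(\tau, s)C(s)\mathcal{Q}(s)) \big)\,ds. \tag{2.23} \]
Relation (2.23) may be presented as a map $\mathcal{X}^*[\tau] = T_\tau^\vartheta \mathcal{M}$.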


Remark 3. Under Assumption 1 we shall use set $\mathcal{X}^*[\tau]$ for constructing the synthesizing
control strategy. In the absence of this assumption one has to consider instead
of $\mathcal{X}^*[\tau]$ a multivalued integral of the alternated Pontryagin type [25], [26], [15],
which is the Hausdorff limit

lim Xy[r] = ¥?[7] 


N—-oo 


of a nonempty superposition 
Anil =I Te M 


taken over a sequence of partitions {7,71,..., TN, V} of the interval [to, 0] with uni- 
formly increasing density when N — oo. 


The emphasis of this chapter is on the ellipsoidal technique rather than the gen- 
eral scheme. This may justify the acceptance of Assumption 1. 

We shall now indicate the solution strategy for Problem CS. Suppose
that at time $\tau$ the on-line realizations of systems (2.10), (2.11), (2.12) are
$\{x^* = x^*[\tau],\ \mathcal{W} = \mathcal{W}[\tau]\}$. Consider function

\[ V(\tau, x^*, \mathcal{W}) = h_+(x^* + \mathcal{W}[\tau], \mathcal{X}^*[\tau]) = \max\{\langle l, x^* \rangle + \rho(l \mid \mathcal{W}[\tau]) - \rho(l \mid \mathcal{X}^*[\tau]) \mid \langle l, l \rangle \le 1\}. \tag{2.24} \]

This problem has a unique maximizer $l^0$ for $V(\tau, x^*, \mathcal{W}) > 0$. If $V(\tau, x^*, \mathcal{W}) = 0$
we take $l^0 = 0$. Proceeding further, introduce the set-valued strategy
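\[ U^0(\tau, x^*, \mathcal{W}) = \arg\min_{u} \max_{\zeta} \Big\{ \frac{dV(\tau, x^*, \mathcal{W})}{d\tau}\Big|_{u,\zeta} \;\Big|\; \zeta \in Z_\tau(\mathcal{W}),\ u \in \mathcal{P}(\tau) \Big\}, \tag{2.25} \]

if $V(\tau, x^*, \mathcal{W}) > 0$, and $U^0(\tau, x^*, \mathcal{W}) = \mathcal{P}(\tau)$ if $V(\tau, x^*, \mathcal{W}) = 0$.

Here $dV(\tau, x^*, \mathcal{W})/d\tau|_{u,\zeta}$ is the total derivative of functional $V(\tau, x^*, \mathcal{W})$ due to
the “motions” $\{x^*(\tau), \mathcal{W}[\tau]\}$ described by (2.10), (2.18) and formula (2.17), being
taken along “directions” $\{u, \zeta\}$. We shall prove that strategy $U^0(\tau, x^*, \mathcal{W})$ ensures
functional $V(\tau, x^*, \mathcal{W})$ to be nonincreasing along the “motions” $\{x^*(\tau), \mathcal{W}[\tau]\}$.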

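Let us calculate the total derivative of function $V(\tau, x^*, \mathcal{W}) > 0$. Denote the
unique maximizer in (2.24) as $l^0$. Using the rules of differentiating functions of
“maximum” type, we have

\[ dV(\tau, x^*, \mathcal{W}[\tau])/d\tau|_{u,\zeta} = d\{\langle l^0, x^* \rangle - \Phi(l^0, \tau)\}/d\tau|_{u,\zeta}, \tag{2.26} \]

where

\[ \Phi(l, \tau) = \rho(l \mid \mathcal{M}) + \int_{\tau}^{\vartheta} \big( \rho(-l \mid B(s)\mathcal{P}(s)) - \rho(l \mid C(s)\mathcal{Q}(s)) \big)\,ds - \rho(l \mid \mathcal{W}[\tau]). \]
This gives, in view of relations (2.23), (2.24),

\[ dV(\tau, x^*, \mathcal{W}[\tau])/d\tau|_{u,\zeta} = \langle l^0, B(\tau)u \rangle + \rho(-l^0 \mid B(\tau)\mathcal{P}(\tau)) - \rho(l^0 \mid C(\tau)\mathcal{Q}(\tau)) + d\rho(l^0 \mid \mathcal{W}[\tau])/d\tau|_{\zeta}, \]

where

\[ d\rho(l^0 \mid \mathcal{W}[\tau])/d\tau|_{\zeta} = \langle l^0, C(\tau)f(\tau) \rangle + \rho(\lambda^0 H(\tau) \mid w - \mathcal{W}) + \rho(\lambda^0 \mid \xi(\tau) - \mathcal{R}(\tau)). \]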


Here it was assumed that function $\lambda^0(\tau)$ is the maximizer for a problem such
as (2.21), but taken for the interval $[\tau, \tau + \sigma]$, $\sigma > 0$. It is also assumed that
here the comaximizer $\alpha^0 = t$, which means that $\alpha^0(t)$ has no jump at time $\tau$.
The maximizer $\lambda^0(\tau)$ is assumed continuous and unique, allowing representation
$d\Lambda^0(\tau)/d\tau = \lambda^0(\tau)\,d\alpha^0(\tau)$ with $\Lambda(\tau)$ being a function of bounded variation, of
dimension $m$. The uniqueness of maximizer $\lambda^0(\tau)$ may be ensured by assuming $\mathcal{R}(\tau)$
to be a nondegenerate ellipsoid (such an $\mathcal{R}$ is taken in the next section). Here we also
used the property that either $\lambda^0(\tau)H(\tau) = l^0(\tau)$, or $\lambda^0(\tau) = 0$ (see [23], Section
17).

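Note that the last derivative is calculated along “directions” $\{u, \zeta^*\}$ where $\zeta^* =
\{w^*, f^*(\tau), \xi^*(\tau)\}$. After a maximization over all $\zeta^* \in Z_\tau(\mathcal{W})$, we come to

\[ \max\{dV(\tau, x^*, \mathcal{W}[\tau])/d\tau|_{u,\zeta^*} \mid \zeta^* \in Z_\tau\} \tag{2.27} \]

\[ = -\langle -l^0, B(\tau)u \rangle + \rho(-l^0 \mid B(\tau)\mathcal{P}(\tau)). \tag{2.28} \]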
We may now specify the desired solution of Problem CS as strategy
\[ U^0(\tau, x^*, \mathcal{W}) = \arg\max\{\langle -l^0, B(\tau)u \rangle \mid u \in \mathcal{P}(\tau)\}. \tag{2.29} \]
Summarizing the above, we come to the assertion

Theorem 2. Strategy $U^0(\tau, x^*, \mathcal{W})$ of (2.25), (2.29) ensures the inequality
\[ dV(\tau, x^*, \mathcal{W}[\tau])/d\tau|_{u,\zeta} \le 0, \]

for any $u \in U^0(\tau, x^*, \mathcal{W})$, whatever be $\zeta \in Z_\tau$. Strategy $U^0(\tau, x^*, \mathcal{W})$ depends on
the maximizer $l^0 = l^0(\tau, x^*, \mathcal{W})$ of problem (2.24), according to (2.29).


Thus, the crucial point in solving the overall problem of measurement feedback
control synthesis is to find the vector $l^0 = l^0(\tau, x^*, \mathcal{W})$. In its turn, this requires us
to calculate set $\mathcal{X}^*(\tau, \mathcal{W}(\tau))$, leading one to calculate an array of other set-valued
functions. The required calculations may appear to be rather cumbersome. We shall
show that these calculations may be quite feasible if based on ellipsoidal-valued
calculus as introduced in [13], [24].


2.5 A Solution Through Ellipsoidal Techniques 


We shall now indicate some ellipsoidal techniques for solving the problems of
this chapter. Denote a nondegenerate ellipsoid with center $p$ and shape matrix $P$
as

\[ E(p, P) = \{x : \langle x - p, P^{-1}(x - p) \rangle \le 1\}. \]

Note that its support function is
\[ \rho(l \mid E(p, P)) = \langle l, p \rangle + \langle l, Pl \rangle^{1/2}. \]
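
The support function of an ellipsoid, $\rho(l \mid E(p, P)) = \langle l, p \rangle + \langle l, Pl \rangle^{1/2}$, is what makes ellipsoidal calculus cheap: every support value costs one quadratic form. The following minimal sketch evaluates it for an assumed planar ellipsoid with semi-axes 2 and 1; the function name and the numerical data are illustrative only.

```python
import math


def support_ellipsoid(l, p, P):
    """Support function of E(p, P) = {x : <x - p, P^{-1}(x - p)> <= 1}."""
    Pl = [sum(P[i][j] * l[j] for j in range(len(l))) for i in range(len(l))]
    center_part = sum(li * pi for li, pi in zip(l, p))
    shape_part = math.sqrt(sum(li * x for li, x in zip(l, Pl)))
    return center_part + shape_part


center = [1.0, 0.0]
shape = [[4.0, 0.0], [0.0, 1.0]]  # semi-axes 2 (along x1) and 1 (along x2)
along_x1 = support_ellipsoid([1.0, 0.0], center, shape)
along_x2 = support_ellipsoid([0.0, 1.0], center, shape)
```

Along the first coordinate the value is the center offset plus the semi-axis length, and along the second it is the second semi-axis alone, as expected for this axis-aligned example.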


We further assume the target set $\mathcal{M} = E(m, M)$ to be an ellipsoid and the hard
bounds on $w(t_0)$, $u$, $f$, $\xi$ also to be ellipsoidal, of respective dimensions, namely

\[ w(t_0) \in \mathcal{X}^0 = E(x^0, X^0), \quad u \in \mathcal{P}(t) = E(p(t), P(t)), \]
\[ f(t) \in \mathcal{Q}(t) = E(q(t), Q(t)), \quad \xi(t) \in \mathcal{R}(t) = E(0, R(t)). \tag{2.30} \]
Here $M = M' > 0$, $X^0 = X^{0\prime} > 0$, and $P(t) = P'(t) > 0$, $Q(t) = Q'(t) > 0$,
$R(t) = R'(t) > 0$.
We will solve the problem of measurement feedback control in several stages.

Stage 1. Solve Problem $E$ of finding the information set $\mathcal{W}[\tau]$ for system (2.11)
under constraints (2.30). Here we are actually to describe the reach set $\mathcal{W}[\tau] =
\mathcal{W}(\tau, t_0, E(x^0, X^0))$ of system

\[ \dot{w} \in C(t)E(q(t), Q(t)), \quad w(t_0) \in E(x^0, X^0), \]

under on-line state constraint
\[ z^*(t) - H(t)w(t) \in E(0, R(t)), \quad t \in [t_0, \tau], \]
where $z^*(t)$ is given. The last inclusion may be rewritten as
\[ H(t)w(t) \in E(z^*(t), R(t)). \tag{2.31} \]


As indicated in [13], there exists a parametrized family of external ellipsoids
$E(w_+(t), W_+(t))$ which approximate $\mathcal{W}[\tau]$ from above:

\[ \mathcal{W}[\tau] \subseteq E(w_+(\tau), W_+(\tau)), \]
and are described through ordinary differential equations:
\[ \dot{w}_+ = C(t)q(t) + L(t)z^*(t), \quad w_+(t_0) = x^0, \tag{2.32} \]

\[ \dot{W}_+ = -(L'(t)H(t)W_+ + W_+H'(t)L(t)) + (\pi_u(t) + \pi_z(t))W_+ + \pi_u^{-1}(t)C(t)Q(t)C'(t) + \pi_z^{-1}(t)L'(t)R(t)L(t), \quad W_+(t_0) = X^0. \tag{2.33} \]

Here $L(\cdot) \in K$ may be chosen in a compact set $K$ of piecewise-continuous matrix
functions, while $\pi_u(t) > 0$, $\pi_z(t) > 0$ among continuous positive functions.

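Let us denote the external ellipsoids as $E(w_+(\tau), W_+(\tau) \mid \omega(\tau))$, indicating their
dependence on the parametrizing functions

\[ \omega(t) = \{L(\cdot), \pi_u(\cdot), \pi_z(\cdot)\}, \quad L(\cdot) \in K;\ \pi_u(t) > 0,\ \pi_z(t) > 0,\ t \in [t_0, \tau]. \]
The next assertion is true (see [14]).

Theorem 3. Set $\mathcal{W}[\tau]$ may be described either through its support function
\[ \rho(l \mid \mathcal{W}[\tau]) = \inf\{\rho(l \mid E(w_+(\tau), W_+(\tau) \mid \omega(\tau))) \mid \omega(\tau)\}, \]

or, in set-valued terms, as

\[ \mathcal{W}[\tau] = \bigcap\{E(w_+(\tau), W_+(\tau) \mid \omega(\tau)) \mid \omega(\tau)\}. \]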


Remark 4. Among the parametrizing functions $\omega(\cdot)$ there may exist for each vector
$l$ such triplets $\omega^0(\tau) = \{L^0(\cdot), \pi_u^0(\cdot), \pi_z^0(\cdot)\}$, which ensure the external ellipsoids to
be tight, namely,

\[ \rho(l \mid \mathcal{W}[\tau]) = \rho(l \mid E(w_+(\tau), W_+(\tau) \mid \omega^0(\tau))). \]

The description of such parametrizers and a recursive form of their calculation is
given in [14].


Stage 2. Given set $\mathcal{W}$, find set $\mathcal{X}^*[\tau] \in \mathbf{X}[\tau, \mathcal{W}]$. This set is described by (2.23).
The calculation of ellipsoidal bounds for such sets was given in [15], [24]. We have

\[ E_-(x_e(\tau), X_-(\tau)) \subseteq \mathcal{X}^*[\tau] \subseteq E_+(x_e(\tau), X_+(\tau)) \]
where
\[ \dot{x}_e = B(t)p(t) + C(t)q(t), \tag{2.34} \]
X4 = — y(t) X4 — mi) Bt) P()B' (E) 
+X SHACHA) +S OCHO AXE (2.35) 
and 


X- = HOX- +F OCMOQHC'(t) 
— (X* (HBEP! (t) + PPOB ESIXE). (2.36) 


Here XXY = X,, X*X* = X_ and 7,(t), yp(t), are positive, contin- 
uous parametrizing functions, while S(t), S1 (t), S2(t) are piecewise continuous 
parametrizing orthogonal matrices: SS’ = I, S151 = I, S255 = I. The boundary 
conditions are 

\[ x_e(\vartheta) = m, \quad X_+(\vartheta) = X_-(\vartheta) = M. \tag{2.37} \]


According to [15], the given ellipsoidal approximations allow exact representations
of the approximated set $\mathcal{X}^*[\tau]$. Denote the duplet and triplet defined on $[\tau, \vartheta]$
as


X4(7) = tl), SO), x-0) = tr), $16), Sa) 


and also denote 
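\[ E(x_e, X_+(\tau)) = E(x_e, X_+(\tau) \mid \chi_+(\tau)), \quad E(x_e, X_-(\tau)) = E(x_e, X_-(\tau) \mid \chi_-(\tau)), \]
then we have

Theorem 4. The following representations are true:
\[ \rho(l \mid \mathcal{X}^*[\tau]) = \min\{\rho(l \mid E(x_e, X_+(\tau) \mid \chi_+(\tau))) \mid \chi_+(\tau)\}, \tag{2.38} \]
and
\[ \rho(l \mid \mathcal{X}^*[\tau]) = \max\{\rho(l \mid E(x_e, X_-(\tau) \mid \chi_-(\tau))) \mid \chi_-(\tau)\}. \tag{2.39} \]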


In the previous relations the minimum and maximum are attained on some specific 
triplets. This is due to existence of tight approximations, [15]. 


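Stage 3. Specify the control strategy $U(\tau, x^*, \mathcal{W})$. Here we first have to consider
the problem (2.24) of finding the maximizer $l^0(\tau, x^*, \mathcal{W})$ for

\[ V(\tau, x^*, \mathcal{W}) = \max\{\langle l, x^* \rangle + \rho(l \mid \mathcal{W}[\tau]) - \rho(l \mid \mathcal{X}^*[\tau]) \mid \langle l, l \rangle \le 1\} \]

but treating it in terms of an ellipsoidal approximation.

In order to achieve a guaranteed estimate $V_e(\tau, x^*, \mathcal{W})$ of $V(\tau, x^*, \mathcal{W})$, we shall
substitute an external ellipsoid $E(w_e, W_+(\tau) \mid \omega(\tau))$ for $\mathcal{W}[\tau]$ and an internal ellipsoid
$E(x_e(\tau), X_-(\tau) \mid \chi_-(\tau))$ for $\mathcal{X}^*[\tau]$. Then we will have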
\[ V(\tau, x^*, \mathcal{W}) \le \max\{\langle l, x^* \rangle + \rho(l \mid E(w_e, W_+(\tau) \mid \omega(\tau))) - \rho(l \mid E(x_e(\tau), X_-(\tau) \mid \chi_-(\tau))) \mid \langle l, l \rangle \le 1\} = V_e(\tau, x^*, \mathcal{W}). \tag{2.40} \]


We shall now calculate the total derivative $dV_e(\tau, x^*, \mathcal{W})/d\tau$. Applying relations
(2.32)-(2.36), we shall use the specific parametrizing functions $\pi_u$, $\pi_z$, $\gamma_f$ and
matrices $S_u$ that ensure the respective ellipsoids $E(w_e, W_+)$, $E(x_e, X_-)$ to be tight.
Such parametrizers will be marked by upper index 0, as $\pi_z = \pi_z^0$, $\omega = \omega^0$, for example.
The necessary formulas for these “tightening” parametrizers are borrowed from
[15].

Denote the unique maximizer for (2.40) with $V_e(\tau, x^*, \mathcal{W}) > 0$ as $l_e$ and for
$V_e(\tau, x^*, \mathcal{W}) = 0$ take maximizer $l_e = 0$. Applying the rules for differentiating
functions of “maximum” type, we have

\[ d\langle l_e, x^* \rangle/d\tau = \langle l_e, B(\tau)u(\tau) \rangle, \tag{2.41} \]


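\[
\begin{aligned}
d\rho(l_e \mid E(x_e(\tau), X_-(\tau) \mid \chi_-^0(\tau)))/d\tau
&= \langle l_e, B(\tau)p(\tau) + C(\tau)q(\tau) \rangle + \tfrac{1}{2}\langle l_e, X_-(\tau)l_e \rangle^{-1/2} \\
&\quad \times \Big( \gamma_f^0(\tau)\langle l_e, X_-(\tau)l_e \rangle + (\gamma_f^0(\tau))^{-1}\langle l_e, C(\tau)Q(\tau)C'(\tau)l_e \rangle \\
&\qquad - 2\langle l_e, X_-^{1/2}(\tau)S_u(\tau)B(\tau)P^{1/2}(\tau)l_e \rangle \Big) \\
&= \langle l_e, B(\tau)p(\tau) + C(\tau)q(\tau) \rangle + \langle l_e, C(\tau)Q(\tau)C'(\tau)l_e \rangle^{1/2} \\
&\quad - \langle l_e, B(\tau)P(\tau)B'(\tau)l_e \rangle^{1/2}.
\end{aligned}
\tag{2.42}
\]

Here we have used the relations

\[ \gamma_f^0(\tau) = \langle l_e, C(\tau)Q(\tau)C'(\tau)l_e \rangle^{1/2}\langle l_e, X_-(\tau)l_e \rangle^{-1/2}, \]

\[ \langle l_e, X_-^{1/2}(\tau)S_u(\tau)B(\tau)P^{1/2}(\tau)l_e \rangle = \langle l_e, B(\tau)P(\tau)B'(\tau)l_e \rangle^{1/2}\langle l_e, X_-(\tau)l_e \rangle^{1/2} \]

of [15], Section 5.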
Proceeding mii and foovane Remark 4, denote the optimal parametrizing 
triplet for le as w?r = {L}, 7¢.,72.}. Then we have 


dp nl E(ul7), Wa (P)O) er = 5 (le, Wale)? ( (le (EE (HCW 
+ WyH(r)L9(7) le) + (aGe(r) + 22.(1)) We 
+ (HB. Ml CNAE) = (#85) Mle EE (ROLLA) ) 


= (le, L3 H(r)we) + (le, O(7)a(7)) + (le, L2z(t)) 





—(le, LO (7) H(7)W4le) (le, Wale) */? + (le, C'(T)Q(7)C (rte)? 
— (le, LY (7) R(r)L2(r)le)/? + (le, L? (7) H (1) (w(7) — we(T))) 
+ (le, LE (r)E(T)) + (le, C(r)a(7)). (2.43) 
In the lines above we have used relations
T7 (T) = (le, C(r)le) "(les Wale)”, 
TzelT) = (le, Le (7) R(r)Le(r)le)/ (le, We)”. 
Using the Cauchy—Bunyakovski inequality 


(le, Le(7) A (7) Wyle) = (H'(7)Le(T)le, Wyle) 
< (H'(T)Le(T)le, WH (7) Le(T)le)*/? (le, Wyle)? 


and summarizing (2.41) - (2.43), we come to the next assertion. 


Lemma 2. The following formula is true: 





dVe(T, 2*, W)/dr = (le, B(T) (u(r) — p(T (le, B(r) P(r) B'(r)le) 
b (le, LO (r)E(T)) — (le, LY (T)R(T)LL(T)le)"? 
+ (H"(r)L2(r)le, (W(T) — we(T))) 
— (H'(T)LL(T)le, W4 H'(T)LL(T)le)"/. (2.44) 


Let us now specify the control $U_e(\tau, x^*, W)$ as

$$U_e^0(\tau, x^*, W) = \arg\max\{(-l_e, u) \mid u \in E(p, P)\}. \quad (2.45)$$

Relation (2.44) yields the conclusion:


Theorem 5. Suppose $u^0 \in U_e^0(\tau, x^*)$. Then the following is true:
(i) the inequality

$$dV_e(\tau, x^*, W)/d\tau \,|_{u^0, z} \le 0, \quad (2.46)$$

holds, whatever be the elements
$z = Hw + \xi$, $\xi \in E(0, R)$, $w \in E(w_e, W_+)$;


(ii) with $V_e(t_0, 0, E(x^0, X^0)) = \mu^0 > 0$ the strategy $U_e^0(\tau, x^*)$ yields the
inequality $V_e(\tau, x^*, W) \le \mu^0$, ensuring at time $\vartheta$ the inclusion


$$x^*(\vartheta) + W[\vartheta] \subseteq \mathcal{M} + \mu^0 E(0, I),$$


whatever be the system and measurement input disturbances $f(t)$, $\xi(t)$ and the
unknown initial vector $x(t_0) \in E(x^0, X^0)$.


Strategy $U_e^0(\tau, x^*, W)$ depends on vector $l_e(\tau, x^*, W)$ which at each instant of
time $\tau$ ensures the inequality


(le, a" (T)) + pllelE(we(7), W+(7)lw(7)) = p(lelE (te(r), X(T) 
for some $\mu(\tau) \le \mu^0$, being the unique maximizer in (2.40). Moreover, the approximating
ellipsoids are selected through their tightening parametrizers, so as to ensure
the equality $l^0 = l_e$ for the maximizers of (2.24), (2.40), with


p(1°|a* (T) + Wir])) = p(2?|a* (T) + E(we(r), W4(7))) 
p(1|E(ae(r), X_(r)) + w(7)E(O, T) 
p(1?|&*(r)) + u(7)E(0, T) (2.47) 


for some u(T) > 0. Then the Hausdorff semidistance 
hy (a* + WO], ¥*(9)) < p°. 


Remark 5. (i) The specification of the desired control strategy is thus reduced to the
problem of maximizing a function of type


$$(l, x) - (l, Xl)^{1/2} + (l, Wl)^{1/2}, \quad X = X' > 0, \; W = W' > 0,$$


over a unit ball: $l \in E(0, I)$, where the latter may be substituted by an ellipsoid
$E(0, D)$, $D = D' > 0$. Selecting an appropriate matrix $D$ may facilitate the
maximization. Such, for example, is the case, when


E(0, D) C E(0, X*)—E(0, W,), 


and the approximation is tight. 

(ii) Note that all the solutions to the differential equations for the ellipsoids used 
throughout this section are given in recurrent form. This is because all the parameters 
and parametrizing functions in these equations do not require recalculation anew 
online. 

(iii) In the absence of Assumption 1 the essential schemes and the ellipsoidal
formulas are the same, but the theoretical proofs of Section 3 would be longer, as
they have to involve manipulations with alternated integrals.
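
The maximization described in Remark 5(i) can be explored numerically. The following Python sketch evaluates the function $(l, x) - (l, Xl)^{1/2} + (l, Wl)^{1/2}$ and looks for a maximizer by crude random search. It is only an illustration under stated assumptions: the function names, the sampling scheme and the sample count are hypothetical and are not the method of the chapter. Since the function is positively homogeneous of degree one in $l$, its maximum over the unit ball is either $0$ (at $l = 0$) or is attained on the unit sphere, so the sketch samples directions on the sphere only.

```python
import math
import random


def quad_form_sqrt(l, Q):
    """Return (l, Q l)^{1/2} for a symmetric nonnegative definite matrix Q."""
    n = len(l)
    s = sum(l[i] * Q[i][j] * l[j] for i in range(n) for j in range(n))
    return math.sqrt(max(s, 0.0))  # guard against tiny negative round-off


def objective(l, x, X, W):
    """F(l) = (l, x) - (l, X l)^{1/2} + (l, W l)^{1/2}."""
    inner = sum(li * xi for li, xi in zip(l, x))
    return inner - quad_form_sqrt(l, X) + quad_form_sqrt(l, W)


def crude_maximize(x, X, W, samples=5000, seed=0):
    """Random search on the unit sphere (an assumed, non-rigorous scheme).

    Returns (best value found, corresponding l). The point l = 0 is always
    feasible and gives F(0) = 0, so the search starts from it.
    """
    rng = random.Random(seed)
    n = len(x)
    best_val, best_l = 0.0, [0.0] * n
    for _ in range(samples):
        g = [rng.gauss(0.0, 1.0) for _ in range(n)]
        norm = math.sqrt(sum(v * v for v in g))
        if norm == 0.0:
            continue
        l = [v / norm for v in g]
        val = objective(l, x, X, W)
        if val > best_val:
            best_val, best_l = val, l
    return best_val, best_l
```

When $X = W$ the two square-root terms cancel and the maximum over the unit ball is the Euclidean norm of $x$, which gives a simple sanity check for the sketch.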


2.6 Conclusion 


In this chapter we indicated a solution scheme for the problem of measurement 
feedback control under set-membership uncertainty with hard bounds on the uncer- 
tain items. The solution is given in terms of guaranteed estimation theory and set- 
valued control procedures. The recommended numerical solution is based on apply- 
ing tight ellipsoidal representations of set-valued functions, making use of recurrent 
dynamic relations in the form introduced by P. Varaiya and the author.

3.1 Prologue


Pravin Varaiya’s research career is marked by an ever-expanding breadth of in- 
terests starting with classical areas of electrical and communication engineering, but 
frequently intersecting with fields as far apart as highway traffic systems, game the- 
ory and economics. Indirectly through his students, post-doctoral fellows, “mentees” 
and even others, who came in contact with him only in chance encounters, his intel- 
lectual reach has gone much further. 

From the mid-1990s to the present, a research theme that Pravin Varaiya has
explored deeply concerns “hybrid automata.” These are systems describing a
discrete program in a continuous environment. The best natural example that comes 
to mind would be a description of developmental stages of an organism embedded 
inside an environment composed of a variety of biological macromolecules (DNA, 
RNA and protein) synthesizing, duplicating, modulating and degrading each other in 
a complex manner. The basic developmental program interacts with the environment 
through injuries, infection, immune interactions, mutations, diseases, aging and evo- 
lutionary processes. While unfortunately the asymptotic destinies of these systems 
and their components are degradation, death, and extinction, the transient behaviors 
of these hybrid automata remain infinitely fascinating to us for obvious reasons. 

Consequently, even though hybrid automata of the kind that Pravin Varaiya ex- 
plored were motivated by examples from complex engineered systems, there are 
many questions that he had raised in the engineering context that remain equally in- 
teresting also in the biological situation. In a paper that Pravin Varaiya wrote with 
Mikhail Kourjanski, they explored the question of how to characterize “stability of 


* This work was supported by grants from NSF’s Qubic program, NSF’s ITR program, De- 
fense Advanced Research Projects Agency (DARPA), Howard Hughes Medical Institute 
(HHMI) biomedical support research grant, the US Department of Energy (DOE), the US
Air Force (AFRL), National Institutes of Health (NIH) and New York State Office of
Science, Technology & Academic Research (NYSTAR).
hybrid systems.” (See [16].) In this paper they studied a particular class of hybrid
automata that are now called rectangular automata, and restricted their attention to the
ones in which the discrete states go through a loop and which also contain an infinite
trajectory starting from some state. Such a viable system was shown to be exactly
characterized by rectangular systems with a fixed point or an infinite cycle.

Because of our biological motivations, we extend the notion to functional hybrid 
automata whose flow and reset conditions are based on real functions (and even 
further restricted to semi-algebraic functions when we seek algorithmic solutions). 
We are now able to ask similar questions about stability (rather simple in this case) 
and limit cycles. 

In particular, we show that functional hybrid automata, which can be used to
model biological systems, can be reduced to systems of differential equations. As a
consequence, many results obtained in dynamical systems theory (e.g., Lyapunov’s
stability theorems and the LaSalle invariance principle [17]) apply mutatis mutandis.

The chapter is organized as follows: we start with a brief but comprehensive 
overview of biological system models and one interesting example, the circadian 
clock, whose cyclic rhythm governs our daily function (Section 3.2), and follow it 
with a formal introduction to functional hybrid automata and the question of their 
stability (Section 3.3). We then focus on our technical approach involving a direct 
translation of a subclass of functional hybrid automata into systems of differential 
equations (Section 3.4), thus making our problem amenable to classical approaches. 
We place our work in the context of other related works (Section 3.5) and conclude 
in Section 3.6 with a discussion of how new challenges from systems biology may 
rely on the revolution that Pravin Varaiya and his colleagues started. 


3.2 Biological System Models 


The central dogma of biology translates easily to a mathematical formalism for
biochemical processes involved in gene regulation. This principle states that
biochemical information flow in cells is unidirectional: DNA molecules code information
that gets transcribed into RNA, and RNA then gets translated into proteins. To
model a regulatory system for genes, we must also include an important subclass of
proteins (transcription activators), which also affect and modulate the transcription
process itself, thus completing the cycle. We can write down kinetic mass-action
equations for the time variation of the concentrations of these species, in the form
of a system of ordinary differential equations (ODEs) [10, 15, 24]. In particular,
the transcription process can be described by equations of the Hill type, with its Hill
coefficient $n$ depending on the cooperativity among the transcription binding sites.
If the concentrations of DNA and RNA are denoted by $M_x$, $M_y$, etc., and those of
proteins by $P_x$, $P_y$, etc., then the relevant equations are of the form:
$$\dot{M}_x = -k_1 M_x + k_3 \frac{1 + \theta P_y^n}{1 + P_y^n}, \quad (3.1)$$
$$\dot{P}_x = -k_2 P_x + k_4 M_x. \quad (3.2)$$


Each equation above is an algebraic differential equation consisting of two algebraic
terms: a positive term representing synthesis and a negative term representing
degradation. For both RNA and proteins the degradation is represented by a linear
function; for RNA, synthesis through transcription is a highly nonlinear but rational
Hill-type function; and for proteins, synthesis through translation is a linear
function of the RNA concentration. In the equation for transcription, when $n = 1$,
the equations are called Michaelis-Menten equations; $P_y$ denotes the concentration
of proteins involved in the transcription initiation of the DNA, $k_1$ and $k_2$ are the
forward rate constants of the degradation of RNA and proteins, respectively, $k_3$ and
$k_4$ are the rate constants for RNA and protein synthesis, and $\theta$ models the saturation
effects in transcription.

If one knew all the species involved in any one pathway, the mass-action equa- 
tions for the system could be expressed in the form 


$$\dot{X}_i = f_i(X_1, X_2, \ldots, X_n), \quad i = 1, 2, \ldots, n. \quad (3.3)$$


When the number of species becomes large, the complexity of the system of dif- 
ferential equations grows rapidly. Furthermore, the mathematics of the dynamical 
system becomes increasingly complex. The integrability of the system of equations, 
for example, depends on the algebraic properties of appropriate bracket operations 
[19, 20]. We can approximately describe the behavior of such a system using a hybrid
automaton [3, 21]. The discrete states of the hybrid system describe regimes
of system behavior which are qualitatively different in terms of which species and 
reactions predominate, and so forth. The “flows,” “invariants,” “guards,” and “reset” 
conditions can be approximated by algebraic systems and the decision procedures for 
determining various properties of these biological systems can be developed using 
the methods of symbolic algorithmic algebra. As we enlarge the scopes of the biolog- 
ical models by considering metabolic processes, signal transduction processes and 
subcellular biochemical processes that are specific to locations and transportation 
between cellular compartmentalizations, the challenges to the algorithmic complex- 
ity and approximability deepen the need for better algorithmic algebraic techniques. 
In the process, we are also forced to explore the connection among constructive ap- 
proaches for differential algebra, commutative algebra, Tarski-algebra, etc. 

As a simple illustrative example, in which the limiting cyclic behavior is rather
important, consider the following model of the “circadian clock.” A widely studied model
of the mechanism for circadian rhythm was first proposed by Goldbeter [14] in terms
of the dynamics involved in the degradation of the period protein (PER); it took into
account multiple phosphorylation of PER and the negative feedback exerted by PER
on the transcription of the period (per) gene. Informally, the per gene transcribes its
corresponding mRNA in the nucleus at a rate negatively governed by nuclear PER
protein: more nuclear PER protein implies less per mRNA and vice versa. The
transcribed per mRNA leaves the nucleus to get translated into PER protein, which after
post-translational modifications (several successive phosphorylation steps) diffuses
back into the nucleus: more per mRNA implies more nuclear PER protein and vice
versa. All these effects can be expressed succinctly in the form of the ODEs we
have described earlier. This minimal biochemical model, supported by experimental
observations, resulted in a better understanding of the limit cycle of the molecular
dynamics inherent to circadian oscillation. The mathematical model, created from
Michaelis-Menten type kinetic models, is a five-dimensional system of first-order
ODEs and involves algebraic rational functions of low degree.

A more detailed model takes into account the role played by the formation of a
complex between the PER and TIM proteins, and requires considering a sequence
of steps for TIM similar to the ones shown below. The more complex system is
10-dimensional and is omitted from discussion. Including further evidence that the TIM
light response is relevant to light-induced phase shifts of the circadian clock, and
its modeling through discrete mode switches, brings us back to the realm of hybrid
automata. While we do not describe such a complex model here, we do emphasize
the fact that understanding the limiting behavior of hybrid models such as these is
important if we wish to understand how light acts as a major environmental signal
for the entrainment of circadian rhythms.

In the equations below, per mRNA, whose cytosolic concentration is denoted
by $M$, is synthesized in the nucleus and transferred into the cytosol, where it is
degraded; the rate of synthesis of PER is proportional to $M$. In order to take into
account the fact that PER is multiply phosphorylated, while keeping the model as
simple as possible, only three states of the protein are considered: unphosphorylated
($P_0$), monophosphorylated ($P_1$) and bisphosphorylated ($P_2$); $P_N$ is the nuclear PER
protein.

Crucial to the mechanism of oscillations in the model is the negative feedback
exerted by the nuclear form $P_N$ in the formation of the PER-TIM complex on the
synthesis of per (and, in the more detailed model, also tim) mRNAs. The negative 
feedback is described by a Hill-type equation. The equations below are also some- 
what idealized as they ignore the linear degradation terms characterized by a rela- 
tively small, nonspecific rate constant. This rate constant does not play an important 
role in the system’s oscillatory behavior but ensures that a steady state exists even 
when degradations are inhibited. 


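$$\dot{M} = v_s \frac{K_I^n}{K_I^n + P_N^n} - v_m \frac{M}{K_m + M}, \quad (3.4)$$
$$\dot{P}_0 = k_s M - V_1 \frac{P_0}{K_1 + P_0} + V_2 \frac{P_1}{K_2 + P_1}, \quad (3.5)$$
$$\dot{P}_1 = V_1 \frac{P_0}{K_1 + P_0} - V_2 \frac{P_1}{K_2 + P_1} - V_3 \frac{P_1}{K_3 + P_1} + V_4 \frac{P_2}{K_4 + P_2}, \quad (3.6)$$
$$\dot{P}_2 = V_3 \frac{P_1}{K_3 + P_1} - V_4 \frac{P_2}{K_4 + P_2} - k_1 P_2 + k_2 P_N - v_d \frac{P_2}{K_d + P_2}, \quad (3.7)$$
$$\dot{P}_N = k_1 P_2 - k_2 P_N, \quad (3.8)$$


$$P_t = P_0 + P_1 + P_2 + P_N. \quad (3.9)$$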
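
As a rough numerical illustration of equations (3.4)-(3.8), the following Python sketch codes the right-hand side of the five-variable model and integrates it with a hand-written classical Runge-Kutta step. The parameter values, the step size and all function names are assumptions made for illustration; they are not given in this chapter and should be replaced by values from the cited literature before drawing any conclusion about period or amplitude.

```python
# Illustrative parameter values (assumptions, not taken from the chapter).
PARAMS = {"vs": 0.76, "vm": 0.65, "Km": 0.5, "KI": 1.0, "n": 4, "ks": 0.38,
          "V1": 3.2, "V2": 1.58, "V3": 5.0, "V4": 2.5,
          "K1": 2.0, "K2": 2.0, "K3": 2.0, "K4": 2.0,
          "k1": 1.9, "k2": 1.3, "vd": 0.95, "Kd": 0.2}


def mm(v_max, k_half, x):
    """Michaelis-Menten rate v_max * x / (k_half + x)."""
    return v_max * x / (k_half + x)


def per_rhs(state, p=PARAMS):
    """Right-hand side of (3.4)-(3.8) for state = [M, P0, P1, P2, PN]."""
    M, P0, P1, P2, PN = state
    # Hill-type negative feedback of nuclear PER on per transcription.
    hill = p["KI"] ** p["n"] / (p["KI"] ** p["n"] + PN ** p["n"])
    dM = p["vs"] * hill - mm(p["vm"], p["Km"], M)
    dP0 = p["ks"] * M - mm(p["V1"], p["K1"], P0) + mm(p["V2"], p["K2"], P1)
    dP1 = (mm(p["V1"], p["K1"], P0) - mm(p["V2"], p["K2"], P1)
           - mm(p["V3"], p["K3"], P1) + mm(p["V4"], p["K4"], P2))
    dP2 = (mm(p["V3"], p["K3"], P1) - mm(p["V4"], p["K4"], P2)
           - p["k1"] * P2 + p["k2"] * PN - mm(p["vd"], p["Kd"], P2))
    dPN = p["k1"] * P2 - p["k2"] * PN
    return [dM, dP0, dP1, dP2, dPN]


def rk4_step(f, y, h):
    """One classical fourth-order Runge-Kutta step of size h."""
    k1 = f(y)
    k2 = f([a + 0.5 * h * b for a, b in zip(y, k1)])
    k3 = f([a + 0.5 * h * b for a, b in zip(y, k2)])
    k4 = f([a + h * b for a, b in zip(y, k3)])
    return [a + h / 6.0 * (b + 2.0 * c + 2.0 * d + e)
            for a, b, c, d, e in zip(y, k1, k2, k3, k4)]


def simulate(y0, h=0.01, steps=1000):
    """Return the list of states visited, starting from y0."""
    traj = [list(y0)]
    for _ in range(steps):
        traj.append(rk4_step(per_rhs, traj[-1], h))
    return traj
```

The total PER protein of (3.9) can be obtained from any state of the trajectory as the sum of its last four components.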
The mathematical model indicates that during oscillation, the peak in per mRNA
precedes the peak in total PER protein by several hours. The key insight was that multiple
PER phosphorylation introduces time-delays which strengthen the negative feedback 
to produce oscillation. An algebraic analysis shows that the rhythm only occurs in 
a range bounded by two critical values of the “maximum rate of PER degradation.” 
The same analysis can be used to show a “rough homeomorphism” between this 
high-dimensional system and a simpler two-dimensional van der Pol equation. The 
other critical parameter was found to be the “average rate of PER transport into the 
nucleus.” The critical dependence of the limit cycle on the degradation parameter 
was a key for biologists to understand the altered period of per mutants. 

In future, we may wish to study further extensions of this initial model: the PER- 
TIM model of Goldbeter, that incorporates the other protein TIM, whose dimeriza- 
tion with PER plays an important role in providing stability to the limit cycle; a 
better model of Tyson et al., that takes into account the detailed structure of PER- 
phosphorylation and inherent competition among several key processes and light- 
sensitivity of TIM. Many of these detailed models will require description in terms 
of hybrid modes. While these extended models are more complex, they appear to
remain homeomorphic to a simple van der Pol-like system, while adding to the stability
of the overall system.

Another interesting avenue to explore concerns the feasibility of synthetic cellu- 
lar clocks. Is it feasible to design simple oscillating systems of a desired periodicity 
by genetic engineering in appropriate cell hosts? If so, such a system could be used 
as a stringent test system of our ability to model complex cellular pathways. We may 
conceive of a simple transcriptional feedback system, using temperature sensitive 
competitive inhibitors (so that clocks can be reset by temperature shifts) and fluores- 
cent reporter systems (so that the phase of the cycle can be examined in individual 
cells and in the population). The advantages of such a system reside in its ease of 
manipulation, ease of monitoring, coupled to the use of genetic selection to explore 
unanticipated behaviors. 


3.3 Hybrid Automata: Stability and Limit Cycles 


3.3.1 Functional Hybrid Automata—Syntax 


The notion of Hybrid Automata was first introduced in [4] as a model and speci- 
fication language for systems consisting of a discrete program within a continuously 
changing environment. For our purpose, it is convenient to introduce a specialized 
notion of functional hybrid automata, whose flow and reset conditions are further 
restricted to functions over the reals. 

The following notations and conventions will be used throughout the chapter: Capital
letters $Z_1, \ldots, Z_k, Z'_1, \ldots, Z'_k$ will denote variables which range over $\mathbb{R}$.
Moreover, $Z$ will denote the vector of variables $(Z_1, \ldots, Z_k)$; similarly, $Z'$ will denote the
vector $(Z'_1, \ldots, Z'_k)$ and $Z''$, the vector $(Z''_1, \ldots, Z''_k)$. The variable $T$ will be used
for time, ranging over $\mathbb{R}^+$. The small letters $p, q, r, s, \ldots$ will denote $k$-dimensional
vectors of real numbers.

Given a formula (function) $\varphi$ we will use the notation $\varphi(Z_1, \ldots, Z_n)$ to stress
the fact that the set of variables occurring in $\varphi$ is included in $\{Z_1, \ldots, Z_n\}$. By
extension, $\varphi(Z^1, \ldots, Z^n)$ will indicate that the variables of $\varphi$ are included in the
set of components of the vectors $Z^1, \ldots, Z^n$. Given a formula (function)
$\varphi(Z^1, \ldots, Z^{i-1}, Z^i, Z^{i+1}, \ldots, Z^n)$, the formula (function) obtained by componentwise
substitution of the elements of $Z^i$ with the elements of $p$ will be denoted by
$\varphi(Z^1, \ldots, Z^{i-1}, p, Z^{i+1}, \ldots, Z^n)$. If the only variables in $\varphi$ are the elements of $Z^i$, then
after the substitution, the value of $\varphi(p)$ will be assumed to be available.


Definition 1 (Hybrid Automata). A hybrid automaton $H = (Z, \dot{Z}, Z', V, E, Inv,
Flow, Act, Reset)$ of dimension $k$ has the following components:


• $Z = (Z_1, \ldots, Z_k)$, $\dot{Z} = (\dot{Z}_1, \ldots, \dot{Z}_k)$, and $Z' = (Z'_1, \ldots, Z'_k)$ are vectors of
variables ranging over $\mathbb{R}$; $Z$ denotes the values of the continuous variables; $\dot{Z}$
denotes the first-order derivatives taken with respect to the time $T \in \mathbb{R}^+$ during
continuous change; $Z'$ denotes the values after a discrete jump;

• $(V, E)$ is a finite directed graph; the nodes, $V$, are called control modes, the
edges, $E$, are called control switches;

• Each vertex $v \in V$ is labeled by the formulae $Inv(v)(Z)$ and $Flow(v)(Z, \dot{Z})$;
$Inv = \{Inv(v)(Z) \mid v \in V\}$ and $Flow = \{Flow(v)(Z, \dot{Z}) \mid v \in V\}$;

• Each edge $e \in E$ is labeled by the formulae $Act(e)(Z)$ and $Reset(e)(Z, Z')$;
$Act = \{Act(e)(Z) \mid e \in E\}$ and $Reset = \{Reset(e)(Z, Z') \mid e \in E\}$.


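Example 1. Consider the following simple hybrid automaton “oscillating” between
two values:


[Figure: a hybrid automaton with two control modes. Surviving edge labels: Act: Z = 3 with Reset: Z' = Z on one edge, and Reset: Z' = Z on the other.]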


Starting in the control mode to the left Z grows at constant rate of 1. After 3 
time units, upon reaching the value of Z = 3, it immediately jumps to the alternate 
control mode to the right, where Z now decreases until it reaches a value of Z = 1. 
Under this condition, it jumps back to the mode to the left. The automaton moves 
back and forth forever between these two modes. 
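
The behavior just described can be reproduced with a few lines of Python. This is a minimal sketch under stated assumptions: the decreasing mode is taken to have rate $-1$ (the text only says that $Z$ decreases), the initial value satisfies $Z \le 3$ in the left mode, and the function name and return format are hypothetical. Because both flows have unit speed, the time needed to reach a guard equals the distance to it, so jump times are computed exactly instead of by time stepping.

```python
def oscillator_trace(z0=0.0, t_end=10.0):
    """Simulate the two-mode automaton of Example 1 up to time t_end.

    Returns a list of (time, new_mode, z) tuples, one per discrete jump.
    Assumed dynamics: left mode dZ/dT = +1 with guard Z = 3,
    right mode dZ/dT = -1 with guard Z = 1, resets Z' = Z.
    Assumes z0 <= 3, so that the left-mode guard lies ahead of z0.
    """
    t, z, mode = 0.0, z0, "left"
    events = []
    while True:
        guard = 3.0 if mode == "left" else 1.0
        dwell = abs(guard - z)  # unit speed: time to guard equals distance
        if t + dwell > t_end:
            break
        t, z = t + dwell, guard                        # continuous transition
        mode = "right" if mode == "left" else "left"   # discrete jump, Z' = Z
        events.append((t, mode, z))
    return events
```

Starting from $Z = 0$ in the left mode, the first jump occurs after 3 time units and subsequent jumps follow every 2 time units, in agreement with the description above.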


Definition 2 (Functional Hybrid Automata and its Syntax). A functional hybrid
automaton $H = (Z, \dot{Z}, Z', V, E, Inv, Flow, Act, Reset)$ of dimension $k$ is a hybrid
automaton of the same dimension satisfying the following additional properties:

• Each invariant formula $Inv(v)$ characterizes a closed subset of $\mathbb{R}^k$;

• Each flow formula $Flow(v)$ is of the form $\dot{Z} = \psi(v)(Z)$ and the Cauchy problem
$\dot{Z} = \psi(v)(Z)$ with initial condition $Z(0) = r$ has a unique solution for each $r$
satisfying $Inv(v)$;

• For each $r$ on the frontier set of the invariant $\partial(Inv(v))$ the solution $Z = \gamma(T)$
of the Cauchy problem $\dot{Z} = \psi(v)(Z)$ with initial condition $Z(0) = r$ further
satisfies the following property:


$$\forall \varepsilon > 0, \; \gamma(\varepsilon) \notin Inv(v);$$


• Each activity formula $Act((v, u))$ characterizes a subset of the frontier set
$\partial(Inv(v))$;

• Each reset formula $Reset(e)$ is of the form $Z' = \rho(e)(Z)$, where $\rho(e)$ is an
injective function.


Example 2. The hybrid automaton of Example 1 is a functional hybrid automaton.
For another example, see the hybrid automata proposed in [13] to model the Delta- 
Notch signaling process; these can be rewritten as functional hybrid automata by 
using closed invariant conditions. This change has no effect on the behaviors of the 
automata. 


Henceforth, we restrict our discussions only to functional hybrid automata. 


3.3.2 Hybrid Automata—Semantics 


The semantics of functional hybrid automata can be defined in terms of execution 
traces. Traces are sequences of pairs with each pair consisting of a point and a control 
mode. Maximal traces are traces which cannot be extended. 


Definition 3 (Functional Hybrid Automata and its Semantics).

Let $H = (Z, \dot{Z}, Z', V, E, Inv, Flow, Act, Reset)$ be a hybrid automaton of
dimension $k$.

A location $\ell$ of $H$ is a pair $\langle v, r \rangle$, where $v \in V$ is a state and
$r = (r_1, \ldots, r_k) \in \mathbb{R}^k$ is an assignment of values for the variables of $Z$. An admissible
location $\langle v, r \rangle$ is one for which $Inv(v)(r)$ holds.

The continuous reachability transition relation $\to_C$ between admissible locations
is defined as follows:


$$\langle v, r \rangle \to_C \langle v, s \rangle \quad \text{iff} \quad \exists t > 0 \, \big(f(0) = r \wedge f(t) = s \wedge \forall t' \in [0, t] \; Inv(v)(f(t'))\big),$$


where $f$ is the solution of the Cauchy problem $\dot{Z} = \psi(v)(Z)$ with initial condition
$Z(0) = r$.

The discrete reachability transition relation $\to_D$ between admissible locations is
defined as follows:


$$\langle v, r \rangle \to_D \langle u, s \rangle \quad \text{iff} \quad (v, u) \in E \wedge Act((v, u))(r) \wedge s = \rho((v, u))(r).$$


A trace of $H$ is a sequence $\ell_0, \ell_1, \ldots, \ell_n, \ldots$ of admissible locations such that
for each $i \ge 0$ either $\ell_i \to_C \ell_{i+1}$ or $\ell_i \to_D \ell_{i+1}$. A trace of $H$ is maximal if it is not
a proper prefix of another trace of $H$.


Notice that our definition of trace is rather general: (1) the length of a trace can 
be either finite or infinite; (2) maximal traces can be of finite length. 


3.3.3 Cyclic Traces 


As discussed in Section 3.2, well-controlled robust periodic behavior is crucial to 
many biological systems: cell cycles, circadian clocks, cyclic expression patterns of 
segmentation clocks (e.g., the Delta/Notch signal transduction system), etc. When we 
model them with hybrid automata (see, e.g., [2, 13]) periodic behaviors correspond 
to cyclic traces. Hence, for a given hybrid automaton H, one may wish to determine: 


Can this hybrid automaton $H$ exhibit a cyclic trace? More formally, does there exist
a trace of $H$ taking the form $\ell_0, \ell_1, \ldots, \ell_n, \ell_0$ with $n \ge 0$?


There are only a handful of results that directly and explicitly address this question
in the context of hybrid automata; efforts directed at the question of stability
of cyclic traces are even rarer. In fact, since hybrid automata are highly
non-deterministic, the problem of analyzing cyclic traces in full generality is difficult.
This limitation does not always apply when it comes to biological systems. Hence,
by modeling biochemical processes with functional hybrid automata, we try to limit
the non-determinism, and exploit this property to study cyclic traces by suitably
modifying results developed in the area of dynamical systems.

Let us begin by classifying cyclic traces in order to understand what makes them
difficult to detect. If $\langle v, r \rangle$ is an admissible location of $H$, such that $\psi(v)(r) = 0$,
then the trace $\langle v, r \rangle, \langle v, r \rangle$ is a cyclic trace of $H$. We call such a cyclic trace a first
gender cycle.


Proposition 1. Let $H$ be a functional hybrid automaton. If for each vertex $v$ the
function $\psi(v)$ and the formula $Inv(v)$ are polynomials over the reals, then the existence
of first gender cycles in $H$ is decidable.


Proof. For each vertex $v$ consider the following first order formula

$$Inv(v)(Z) \wedge \psi(v)(Z) = 0.$$


The solutions of this formula are the points $r$ such that $\langle v, r \rangle, \langle v, r \rangle$ is a first gender
cycle. Since the satisfiability of the formula for any vertex $v$ is decidable [22] and
since the number of nodes $v$ is finite, the first gender cycle problem is decidable, as
claimed.

We remark parenthetically that the result shown above can also be extended to
o-minimal theories [12].

Assume further that $\psi(v)$ is such that a point $r$ satisfying $Inv(v)$ exists and the
solution of the Cauchy problem with initial condition $Z(0) = r$ is a periodic function
with its image included in $Inv(v)$. Then the trace $\langle v, r \rangle, \langle v, r \rangle$ is a cyclic trace of $H$.
We call a cyclic trace of this form a second gender cycle.

In order to detect second gender cycles it is necessary to study all the differential
systems $\psi(v)$ and check if they admit periodic solutions. Many results have
been developed in the areas of dynamical systems and numerical analysis to detect
periodic solutions and study their stability properties. Most of these results are built
upon Lyapunov’s stability theorems and the LaSalle invariance principle [17]. Principles
which apply to monotone systems have been recently studied in [6, 7].

In general, a cyclic trace can be $\langle v_0, r_0 \rangle, \langle v_1, r_1 \rangle, \ldots, \langle v_n, r_n \rangle, \langle v_0, r_0 \rangle$ and
may contain repeated copies of several discrete nodes internally, i.e., there may exist
$i \neq j$, with $i, j \le n$, such that $v_i = v_j$. We will call a cyclic trace of this form a third gender cycle,
a detailed study of which is the key topic of this chapter. In particular, we aim to
reduce this problem to a more classical problem: namely, that of studying periodic
solutions of systems of differential equations, as in the case of second gender cycles.

In a trace there could be many consecutive continuous transitions as well as many 
consecutive discrete transitions. However, when we are looking for cyclic traces we 
can restrict our attention to traces in which each continuous transition is followed by 
a discrete transition. 


Definition 4. Let $H$ be a functional hybrid automaton. A trace $\ell_0, \ell_1, \ldots, \ell_n, \ldots$ is
said to be in normal form if it holds that $\ell_i \to_C \ell_{i+1}$ implies $\ell_{i+1} \not\to_C \ell_{i+2}$, for
each $i \ge 0$.


Lemma 1. Let $H$ be a functional hybrid automaton. If $H$ admits a cyclic trace, then
it admits a cyclic trace in normal form.


Proof. Let $T = \ell_0, \ldots, \ell_n, \ell_0$ be a cyclic trace of $H$. If $n = 0$, then the trace
is already in normal form. Otherwise, $n > 0$, and at each place where the trace contains
a subsequence of the form $\ell_i \to_C \ell_{i+1} \to_C \ell_{i+2}$ in $T$, we may replace it with
$\ell_i \to_C \ell_{i+2}$. By repeated replacement of this kind, until it is no longer possible, we
obtain a sequence which is a cyclic trace of $H$ and is in normal form.
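
The replacement step used in this proof is easy to mechanize. The Python sketch below assumes a hypothetical representation that is not part of the chapter: a finite trace is given as a list of locations together with a parallel list of labels, "C" or "D", one for each transition between consecutive locations. Two consecutive continuous transitions are merged into one, which is justified because both take place in the same control mode and follow the same unique solution of the flow.

```python
def to_normal_form(locations, kinds):
    """Merge consecutive continuous transitions of a finite trace.

    locations: [l_0, ..., l_m]
    kinds:     kinds[i] in {"C", "D"} labels the transition l_i -> l_{i+1}
    Returns (locations, kinds) of the trace in normal form.
    """
    if len(kinds) != len(locations) - 1:
        raise ValueError("need exactly one label per transition")
    out_locs, out_kinds = [locations[0]], []
    for loc, kind in zip(locations[1:], kinds):
        if kind == "C" and out_kinds and out_kinds[-1] == "C":
            # l_i ->C l_{i+1} ->C l_{i+2} becomes l_i ->C l_{i+2}
            out_locs[-1] = loc
        else:
            out_locs.append(loc)
            out_kinds.append(kind)
    return out_locs, out_kinds
```

The first and last locations are never removed, so a cyclic trace stays cyclic after the transformation.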














3.4 From Deterministic Hybrid Automata to ODEs 
In our definition of functional hybrid automata we limit the non-determinism to 
the following cases: 


1. There exists a point which satisfies more than one invariant condition; 
2. There exists a point which satisfies more than one activation condition. 


Inside a vertex, the behavior of a functional hybrid automaton, by the second
condition of Definition 2, is deterministic, as it imposes existence and uniqueness of
the solution for each initial condition. Note, further, that when a functional hybrid
automaton reaches the frontier of an invariant, it must jump immediately, since we 
imposed that the solutions immediately cross the frontier. Once the automaton de- 
cides (perhaps non-deterministically) which edge it may take, it uses a reset condition 
in a deterministic manner, as its reset condition is a function. Thus it remains to show 
that this second source of nondeterminism can be removed, and we can translate a 
functional hybrid automaton into a system of differential equations. 


Definition 5 (Deterministic Functional Hybrid Automata). Let $H = (Z, \dot{Z}, Z',
V, E, Inv, Flow, Act, Reset)$ be a functional hybrid automaton. We say that $H$ is
deterministic, if for each vertex $v \in V$ and for each pair of distinct edges $e_1, e_2 \in E$ with a
common source vertex $v$ we have


$$Act(e_1) \cap Act(e_2) = \emptyset.$$


In our definition of deterministic functional hybrid automata there is still an
apparent source of non-determinism and it is due to the fact that given a point $r \in \mathbb{R}^k$
it is possible to start from more than one location of the form $\langle v, r \rangle$.


Lemma 2. Let $H$ be a deterministic functional hybrid automaton and $\langle v, r \rangle$ be an
admissible location of $H$. Then there exists exactly one maximal trace in normal form
$\ell_0, \ell_1, \ldots, \ell_n, \ldots$ with $\ell_0 = \langle v, r \rangle$.


Proof. The sequence $\langle v, r \rangle$ is always a trace of $H$. Hence, it can be extended to at
least one maximal trace $Tr$. As in the proof of Lemma 1, we can map $Tr$ into a
maximal trace in normal form which starts from $\langle v, r \rangle$.
We may derive a contradiction as follows, by assuming that there are two maximal
traces in normal form, both starting from $\langle v, r \rangle$. We use $\ell_0, \ell_1, \ldots, \ell_n, \ldots$ and
$\ell'_0, \ell'_1, \ldots, \ell'_n, \ldots$ to denote the two traces. Let $i$ be the smallest index such that
$\ell_i \neq \ell'_i$. It must be that $i > 0$. The following four cases must be considered:


1. $\ell_{i-1} \to_C \ell_i$ and $\ell_{i-1} = \ell'_{i-1} \to_C \ell'_i$;
2. $\ell_{i-1} \to_D \ell_i$ and $\ell_{i-1} = \ell'_{i-1} \to_D \ell'_i$;
3. $\ell_{i-1} \to_C \ell_i$ and $\ell_{i-1} = \ell'_{i-1} \to_D \ell'_i$;
4. $\ell_{i-1} \to_D \ell_i$ and $\ell_{i-1} = \ell'_{i-1} \to_C \ell'_i$.


Since the last two cases are essentially equivalent, we need consider only the first
three cases. The first case can be ruled out since in each control mode the solutions
of the differential equations are unique. The second case cannot occur since the
activation conditions of $H$ are disjoint and the resets are functional. Finally, the third case
cannot occur because from $\ell_{i-1} = \ell'_{i-1} \to_D \ell'_i$ we conclude that $\ell_{i-1} = \langle u, s \rangle$ and $s$
is on the frontier of $Inv(u)$, implying that the solution of $\dot{Z} = \psi(u)(Z)$ goes outside
$Inv(u)$. This leads to the desired contradiction: it cannot be that $\ell_{i-1} \to_C \ell_i$.














Given an admissible location $\ell$ we use the notation $Tr(\ell)$ to denote the maximal
trace in normal form starting from $\ell$.

Henceforth, we focus our attention on a deterministic functional hybrid automaton
H. We aim to encode H into a system of differential equations whose solutions
correspond to the traces of H. We start by encoding the nodes of V. Let |V| = n, and
consider an ordering $[v_1, \ldots, v_n]$ of V. We map each vertex of V to a point in $\mathbb{R}^n$ as
follows:


\[
\mu : V \to \mathbb{R}^n, \qquad v_i \mapsto (0, 0, \ldots, 1, \ldots, 0),
\]

where 1 is in position i.

Let $R_1, \ldots, R_n, S_1, \ldots, S_n$ be $2n = 2|V|$ fresh variables. Let also $W_1, \ldots, W_k$
be k fresh variables, where k is the dimension of H.

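For each vertex $v \in V$ we consider the system of differential equations $\Psi(v)$ on
$\mathbb{R}^{2k+2n}$ defined as:

\[
\begin{aligned}
\dot{Z} &= \psi(v)(Z) \\
\dot{W} &= \psi(v)(W) \\
\dot{R} &= 0 \\
\dot{S} &= 0.
\end{aligned}
\]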


This system describes the continuous evolution in v. The variables Z’s and W’s 
evolve as described in the mode v. The variables R’s and S’s do not evolve. They are 
used simply to encode the fact that the automaton is in vertex v. 

Now we can glue together the systems of the different modes, i.e., we will encode
the discrete jumps into differential systems. The basic ideas behind the encoding are
as follows. Let us assume that we are in a point of the form $(z, z, \mu(v_i), \mu(v_i))$ and
z satisfies $Act((v_i, v_j))$. We use two time instants to jump from $(z, z, \mu(v_i), \mu(v_i))$
to $(\rho((v_i, v_j))(z), \rho((v_i, v_j))(z), \mu(v_j), \mu(v_j))$. During the first instant: Z moves
on the segment between z and $\rho((v_i, v_j))(z)$ at constant speed $\rho((v_i, v_j))(z) - z$;
W remains fixed since it is used to determine the constant speed at which Z moves;
R moves on the segment between $\mu(v_i)$ and $\mu(v_j)$ at constant speed 1; S does not
move so that it is clear that we are moving from $\mu(v_i)$ to $\mu(v_j)$ and not the converse.
During the second instant we need to update W and S. Hence in this case, Z does not
move; W moves on the segment between z and $\rho((v_i, v_j))(z)$ at constant speed; R
does not move; S moves on the segment between $\mu(v_i)$ and $\mu(v_j)$ at constant speed.
In particular, to determine the segment on which W has to move we need to use the
values of S and R after one instant (these encode the edge) and the value of Z after
one instant (to determine the constant speed).

We start with the system for the first instant. For each edge $(v_i, v_j)$ we consider
the system $\Psi_1((v_i, v_j))$ defined as:

\[
\begin{aligned}
\dot{Z} &= \rho((v_i, v_j))(W) - W \\
\dot{W} &= 0 \\
\dot{R} &= \mu(v_j) - \mu(v_i) \\
\dot{S} &= 0.
\end{aligned}
\]


As far as the second instant is concerned, we proceed as follows: For each edge
$(v_i, v_j)$ we consider the system $\Psi_2((v_i, v_j))$ defined as:

\[
\begin{aligned}
\dot{Z} &= 0 \\
\dot{W} &= Z - \rho^{-1}((v_i, v_j))(Z) \\
\dot{R} &= 0 \\
\dot{S} &= \mu(v_j) - \mu(v_i).
\end{aligned}
\]
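The two-instant jump can be checked numerically. The following Python sketch is only an illustration and is not taken from the chapter: it assumes a hypothetical one-dimensional automaton with two modes and an affine, invertible reset $\rho(z) = 2z + 1$ on the single edge, and the names `mu`, `rho`, `rho_inv` and `jump` are ours. Each of the two systems is integrated with explicit Euler steps for one time unit.

```python
def mu(i, n):
    """One-hot encoding of vertex v_i (0-based index) as a point of R^n."""
    return [1.0 if j == i else 0.0 for j in range(n)]

def rho(z):
    """Hypothetical injective reset attached to the edge (v_0, v_1)."""
    return 2.0 * z + 1.0

def rho_inv(y):
    """Inverse of the hypothetical reset."""
    return (y - 1.0) / 2.0

def jump(z, i, j, n, steps=1000):
    """Integrate the first-instant and second-instant systems, one time unit each."""
    dt = 1.0 / steps
    Z, W = z, z
    R, S = mu(i, n), mu(i, n)
    d = [b - a for a, b in zip(mu(i, n), mu(j, n))]  # mu(v_j) - mu(v_i)
    # First instant: Z' = rho(W) - W, W' = 0, R' = mu(v_j) - mu(v_i), S' = 0.
    for _ in range(steps):
        Z += dt * (rho(W) - W)
        R = [r + dt * dr for r, dr in zip(R, d)]
    # Second instant: Z' = 0, W' = Z - rho^{-1}(Z), R' = 0, S' = mu(v_j) - mu(v_i).
    for _ in range(steps):
        W += dt * (Z - rho_inv(Z))
        S = [s + dt * ds for s, ds in zip(S, d)]
    return Z, W, R, S

Z, W, R, S = jump(z=3.0, i=0, j=1, n=2)
```

Because every right-hand side is constant during its own phase, the Euler steps are exact up to rounding: starting from $(3, 3, \mu(v_0), \mu(v_0))$ the state after two time units is, up to floating-point error, $(\rho(3), \rho(3), \mu(v_1), \mu(v_1))$. The second phase relies on the reset being invertible, which is where the injectiveness assumption on resets enters.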


To conclude our construction, we collect and assemble the systems $\Psi(v)$, $\Psi_1(e)$,
and $\Psi_2(e)$, combining the invariant and activation conditions of H. For a given formula
$\gamma(Z)$ whose solutions denote a subset $G \subseteq \mathbb{R}^k$, we use $Op(\gamma)(Z)$ to denote
the formula associated with the interior of G. Moreover, consider $\emptyset$, the system of
differential equations which equates all the derivatives to 0. Let the system $\mathcal{H}$ be
defined as follows:

\[
\begin{cases}
\Psi(v_i), & \text{if } Op(Init(v_i))(Z) \wedge R = S = \mu(v_i); \\
\Psi_1((v_i, v_j)), & \text{if } Act((v_i, v_j))(W) \wedge S = \mu(v_i) \wedge R_j < 1; \\
\Psi_2((v_i, v_j)), & \text{if } R = \mu(v_j) \wedge S_i > 0; \\
\emptyset, & \text{otherwise.}
\end{cases}
\]


Notice that this construction uses 2n variables to encode the discrete part of the
automaton. This construction avoids intersections of the solutions during the jumps.
We could obtain the same result using only 6 variables, since given n points in $\mathbb{R}^3$
we can always connect them with $n^2$ non-intersecting curves.

We prove that the solutions of the system $\mathcal{H}$ and the traces of the deterministic
functional hybrid automaton H correspond to each other, i.e., they are in a sense
equivalent. We limit our arguments to traces of infinite length, since for cyclic solutions
this suffices. The definitions and results can be modified appropriately to deal
with traces of finite length.


Definition 6. Let H be a functional hybrid automaton of dimension k with n control
modes. Let $f : \mathbb{R}^+ \to \mathbb{R}^{n+k}$ be a function and $Tr = \ell_0, \ell_1, \ldots, \ell_m, \ldots$ be a
trace of H of infinite length. We say that f and Tr agree if there exists an increasing
sequence $t_0, t_1, \ldots, t_m, \ldots$ of nonnegative reals such that for each i, $\ell_i = f(t_i)$.


Theorem 1. Let H be a deterministic functional hybrid automaton of dimension k
and $\langle v, r\rangle$ be an admissible location of H such that $Tr(\langle v, r\rangle)$ has infinite length.
The solution $Z = f^Z_{\langle v, r\rangle}(t)$, $R = f^R_{\langle v, r\rangle}(t)$ of $\mathcal{H}$ with initial conditions Z = W = r
and $R = S = \mu(v)$ and the trace $Tr(\langle v, r\rangle)$ agree.


Proof. We use $Z = f^Z_{\langle v, r\rangle}(t)$, $W = f^W_{\langle v, r\rangle}(t)$, $R = f^R_{\langle v, r\rangle}(t)$ and $S = f^S_{\langle v, r\rangle}(t)$ to
denote the solution of $\mathcal{H}$ with initial conditions Z = W = r and $R = S = \mu(v)$. We
have to define the sequence $t_0, t_1, \ldots, t_m, \ldots$ satisfying Definition 6. Let $Tr(\langle v, r\rangle)$
be of the form $\langle v, r\rangle, \langle w_1, s_1\rangle, \ldots, \langle w_m, s_m\rangle, \ldots$. We define $t_0 = 0$. The initial
value clearly satisfies Definition 6. Let us assume inductively that we have defined
$t_0, \ldots, t_i$ satisfying Definition 6; we define $t_{i+1}$ as follows:


• if $\langle w_i, s_i\rangle \to_D \langle w_{i+1}, s_{i+1}\rangle$, then $t_{i+1} = t_i + 2$;
• if $\langle w_i, s_i\rangle \to_C \langle w_{i+1}, s_{i+1}\rangle$, then $t_{i+1} = \min\{t > t_i \mid f^Z_{\langle v, r\rangle}(t) = s_{i+1}\}$.

In the first case we see that we still satisfy Definition 6, since $\langle w_i, s_i\rangle$ is in the
activation region of $(w_i, w_{i+1})$ and after two time units the system $\mathcal{H}$ reaches the point
reachable with the discrete jump. As far as the second case is concerned, we get the
same conclusion as a consequence of the facts that we are considering autonomous
systems and that the trace $Tr(\langle v, r\rangle)$ is in normal form (hence the next transition is
discrete).














Thus, we conclude that cyclic traces of H agree with periodic orbits of $\mathcal{H}$.


Corollary 1. Let H be a deterministic functional hybrid automaton. H admits a
cyclic trace if and only if $\mathcal{H}$ has a periodic orbit.


Notice that if the second condition of Definition 2 fails, for example, because the
flows can have either no solution or more than one solution, then Lemma 2 is false.
Nonetheless, we can still construct $\mathcal{H}$ and prove correspondence between traces of
H and solutions of $\mathcal{H}$.


3.5 Related Literature 


To place the results described here in the context of a growing literature, we 
mention a few related results. 

The closest in spirit to our results are those in [18]. There, hybrid automata are 
studied from a dynamical systems perspective. The paper rigorously proves neces- 
sary and sufficient conditions for existence, uniqueness, and continuity of traces. 
Under these assumptions, Lyapunov’s theorem on stability via linearization and 
LaSalle’s invariance principle are generalized to hybrid automata. While our no- 
tion of deterministic functional hybrid automata is intuitively similar to the notion of 
deterministic hybrid automata introduced in [18], there are many fundamental dif- 
ferences: we do not impose that the flows are globally Lipschitz continuous, but we 
assume that they have a unique solution for each initial condition; we impose on the 
resets an injectiveness condition. When the flows of a deterministic functional hybrid
automaton H are globally Lipschitz continuous, all the results proved in [18] apply to
H. In the general case we can map H into the dynamic system $\mathcal{H}$ and try to directly
apply stability and invariance results to $\mathcal{H}$.

In [11] hybrid systems are defined as sets of systems of differential equations. 
Which system has to be used is decided by the initial conditions and by a discrete 
control. On these hybrid systems, stability conditions are studied explicitly. The sys- 
tems in [11] are not continuously linked in the following sense: when there is a switch 
in the discrete part, there is a jump in the continuous part, hence stability results for 
dynamic systems cannot be directly applied. The main difference with our construc- 
tion is that we connect the flows continuously so that we get a piecewise defined 
dynamic system. 

In [1] an affine hybrid automaton H is mapped into a new automaton Bi(H)
which has the same periodic orbits and equilibrium points, but no Zeno behaviors.
The basic idea behind the mapping is to split each control switch by adding a new control
mode and to introduce a time delay in the new modes. This is similar to what we
do in our construction when we use 2 time instants for each edge crossing. In fact,
we can prove that the Zeno behaviors of H correspond to solutions of $\mathcal{H}$ in which
the time flow is unbounded.

In [9] domains of convergence are studied by mapping systems of differential 
equations into discrete automata with an infinite number of states. By combining the 
construction we describe in this chapter with that defined in [9] we get a discretiza- 
tion method for hybrid automata. Relationships with other discretization methods 
(e.g., [5, 23]) remain to be analyzed. 


3.6 Conclusion 


Finally, we return to the biological questions that initiated this journey into the 
stability of hybrid automata. At present, we lack the ability to analyze all but the sim- 
plest regulatory structures composed of a handful of genes and we have no means of 
even intelligently conjecturing what universal principles unify biology. Our notions 
of biological robustness and arguments in its favor are often anecdotal, speculative 
and unsupported by data. For instance, there have been raging debates about the
nature of the robustness exhibited by a circadian clock model that is composed of
analogs of both PER and TIM, while also taking into account the reality that the copy
number of PER-TIM complexes can only assume a small and random value. In
their work, Naama Barkai and Stan Leibler [8] speculate on the existence
of an unmodeled hysteresis mechanism in circadian clock models that confers on them
some degree of robustness. And yet there are others who, using similar simulations,
have argued that the original model is already robust as it is. Clearly, if the truth
is to be found, it will require formal methods; no amount of simulation can deliver it.
Pravin Varaiya’s insights and instincts, buried among his results on engineering hy- 
brid systems, may provide the methods we seek to solve such problems in systems 
biology.


Summary. This chapter gives a survey of the theory of square-integrable martingales and the
construction of basic sets of orthogonal martingales in terms of which all other martingales 
may be expressed as stochastic integrals. Specific cases such as Brownian motion, Lévy pro- 
cesses and stochastic jump processes are discussed, as are some applications to mathematical 
finance. 


Key words: Stochastic integral, martingale, Lévy process, mathematical finance 


4.1 Introduction 


I have (so far) co-authored three papers with Pravin Varaiya [11],[12],[13]. The 
first one [11] concerns linear systems and is, I believe, the first paper anywhere to 
use weak solutions of stochastic differential equations in a control theory context. 
Our best-known paper is certainly [12] which treats stochastic control by martingale 
methods and gives a result sometimes referred to as the Davis—Varaiya maximum 
principle. The third paper [13] is the Cinderella of the set and has more or less dis- 
appeared without trace. It concerns the multiplicity of a filtration — an attempt to 
characterize the minimal number of martingales needed to represent all martingales 
as stochastic integrals. While our paper may have disappeared, interest in questions 
of martingale representation certainly has not. In particular the martingale represen- 
tation property is equivalent to the very fundamental idea of ‘complete markets’ in 
mathematical finance. For this reason it seems time to rescue Cinderella from obscu- 
rity and invite her to the ball. 

The setting for this chapter is the conventional filtered probability space of modern
stochastic analysis. The reader can consult textbooks such as Øksendal [20],
Protter [24] or Rogers and Williams [25] for background. We let $(\Omega, \mathcal{F}, P)$ be a
complete probability space and $(\mathcal{F}_t)_{0 \le t \le \infty}$ be a filtration satisfying les conditions
habituelles (the usual conditions). We assume $\mathcal{F}_\infty = \mathcal{F}$. We denote by $\mathcal{M}$ the set of square-integrable
$\mathcal{F}_t$-martingales, i.e., $M \in \mathcal{M}$ if M is a martingale, $M_0 = 0$ and $\sup_t E M_t^2 < \infty$.
$\mathcal{M}_c$ is the set of $M \in \mathcal{M}$ such that the sample path $t \mapsto M(t, \omega)$ is continuous for
almost all $\omega$. $\mathcal{M}^{loc}$, $\mathcal{M}_c^{loc}$ denote the sets of processes locally in $\mathcal{M}$, $\mathcal{M}_c$. A process
X is càdlàg if its sample paths are right-continuous with left-hand limits; we write
$\Delta X_s = X_s - X_{s-}$.

The next section introduces the $L_2$ theory of stochastic integration, while Section
4.3 describes the Hilbert space structure of the set of square-integrable martingales,
including the Davis–Varaiya results [13]. The standard Brownian motion case
is covered in Section 4.4, while Section 4.5 describes the very striking Jacod–Yor
theorem relating martingale representation to convexity properties of the set of martingale
measures.

In recent years, Lévy processes have become widely used in mathematical fi- 
nance and elsewhere, and in Section 4.6 we summarize results of Nualart and 
Schoutens giving a basis, the so-called Teugels martingales, for square-integrable 
martingales of a certain class of Lévy processes. If the Lévy process has no diffu- 
sive component and a Lévy measure of finite support, then it reduces to a rather 
simple sort of stochastic jump process. But martingale representation theorems are 
available for jump processes in much greater generality; we summarize the theory in 
Section 4.7. Concluding remarks are given in Section 4.8. 


4.2 The Battle of the Brackets 


As is well known, the quadratic variation of the Brownian path $W_t$ over the interval
[0, t] is equal to t, and the second-order term in the Itô formula arises from
the ‘multiplication table’ entry $(dW_t)^2 = dt$. When we move to more general martingales
such as $M \in \mathcal{M}$ there are two candidates to replace ‘dt’. The first is the
‘angular brackets’ process $\langle M\rangle_t$ introduced by Kunita and Watanabe [21], the
existence of which is a direct application of the Meyer decomposition theorem. Indeed,
for $M \in \mathcal{M}$ the process $M_t^2$ is a submartingale and $\langle M\rangle_t$ is defined as the
unique predictable increasing process such that $\langle M\rangle_0 = 0$ and $M_t^2 - \langle M\rangle_t$ is
a martingale. For $M, N \in \mathcal{M}$ the cross-variation process $\langle M, N\rangle_t$ is defined by
polarization:

\[
\langle M, N\rangle_t = \frac{1}{4}\left(\langle M + N\rangle_t - \langle M - N\rangle_t\right).
\]


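(In particular, $\langle M, M\rangle = \langle M\rangle$.) The process $\langle M\rangle$ defines a positive
measure on the predictable $\sigma$-field $\mathcal{P}$ in $(0, \infty) \times \Omega$ by the recipe
$\langle M\rangle(F) = E\int_{(0,\infty)} 1_F(t, \omega)\, d\langle M\rangle_t$. We denote by $L_2(\langle M\rangle)$ the corresponding
$L_2$ space, i.e., the set of predictable processes $\phi$ satisfying
$E\int_0^\infty \phi_s^2\, d\langle M\rangle_s < \infty$. The stochastic integral $\int \phi\, dM$ is characterized in very neat fashion for
$\phi \in L_2(\langle M\rangle)$ as the unique element of $\mathcal{M}$ satisfying

\[
\left\langle \int \phi\, dM, N\right\rangle_t = \int_0^t \phi_s\, d\langle M, N\rangle_s, \qquad t \ge 0 \tag{4.1}
\]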


for all $N \in \mathcal{M}$. Let Z be the set of simple integrands, i.e., processes $\phi$ of the form
$\phi_t(\omega) = \sum_{i=1}^n Z_i(\omega) 1_{]S_i, T_i]}(t, \omega)$ for stopping times $S_i \le T_i$ and bounded
$\mathcal{F}_{S_i}$-measurable random variables $Z_i$. For these integrands the stochastic integral is
defined in the obvious way as

\[
\int \phi\, dM = \sum_{i=1}^n Z_i (M_{T_i} - M_{S_i})
\]

and we have the Itô isometry

\[
E\left(\int \phi\, dM\right)^2 = E\int_0^\infty \phi_s^2\, d\langle M\rangle_s.
\]

The integral may now be defined by continuity on the closure of Z in $L_2(\langle M\rangle)$,
which is equal to $L_2(\langle M\rangle)$ itself, and then (4.1) is satisfied.

In recent times the angular bracket process has generally been superseded by the
‘square brackets’ process $[M]_t$ characterized by the following theorem.¹

Theorem 1. For $M \in \mathcal{M}$ there exists a unique increasing process $[M]_t$ such that (i)
$[M]_0 = 0$, (ii) $M^2 - [M]$ is a uniformly integrable martingale and (iii) $\Delta[M]_t =
(\Delta M_t)^2$ for $t \in (0, \infty)$.


If $M \in \mathcal{M}_c$ then $[M] = \langle M\rangle$. Any $M \in \mathcal{M}$ can be decomposed into $M =
M^c + M^d$ where $M^c \in \mathcal{M}_c$ and $M^d$ is ‘purely discontinuous’ (further details below).
Then

\[
[M]_t = \langle M^c\rangle_t + \sum_{s \le t} (\Delta M_s)^2.
\]

If $M \notin \mathcal{M}_c$ then $S_t = \sum_{s \le t} (\Delta M_s)^2$ is an increasing process and, trivially, a
submartingale, so it has the Meyer decomposition $S_t = U_t + V_t$ where $U_t$ is a martingale
and $V_t$ is a predictable increasing process, the so-called dual predictable projection
of $S_t$. We have

\[
\langle M\rangle_t = \langle M^c\rangle_t + V_t,
\]

and hence $[M]_t - \langle M\rangle_t = U_t$, a uniformly integrable martingale.
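A standard example, not taken from the text, makes the difference between the two brackets concrete. Assume N is a Poisson process with rate $\lambda$, stopped at a fixed horizon T so that the compensated process lies in $\mathcal{M}$ (the horizon T is our assumption, introduced only to ensure square integrability):

```latex
\begin{align*}
M_t &= N_{t \wedge T} - \lambda (t \wedge T), \qquad M^c = 0, \\
[M]_t &= \sum_{s \le t} (\Delta M_s)^2 = N_{t \wedge T}, \\
\langle M \rangle_t &= \lambda (t \wedge T), \\
[M]_t - \langle M \rangle_t &= M_t .
\end{align*}
```

Here $S = [M]$ is random and jumps with the process, $V = \langle M\rangle$ is deterministic (hence predictable), and the martingale U in the Meyer decomposition is M itself.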
Stochastic integrals can now be defined à la Kunita–Watanabe, but based on the
square brackets process. We define

\[
[M, N] = \frac{1}{4}\left([M + N] - [M - N]\right).
\]

The appropriate class of integrands is $L_2([M])$, the set of predictable processes $\phi$
satisfying $E\int_0^\infty \phi_s^2\, d[M]_s < \infty$.














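Theorem 2. For $M \in \mathcal{M}$ and $\phi \in L_2([M])$ there is a unique element $\int \phi\, dM \in \mathcal{M}$
such that $[\int \phi\, dM, N]_t = \int_0^t \phi_s\, d[M, N]_s$ for all $N \in \mathcal{M}$. Further, $\Delta(\int \phi\, dM)_t =
\phi_t \Delta M_t$.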


¹ See Rogers and Williams [25], Section IV.26.

When restricted (as here) to predictable integrands, the integrals defined by (4.1) and
by Theorem 2 are the same. Indeed, they clearly coincide on the set Z of simple
integrands and a monotone class argument shows that $L_2(\langle M\rangle) = L_2([M])$.
The main reason for preferring $[M]$ to $\langle M\rangle$ is universality: $[M]$ is well-defined for
every local martingale M, but not every local martingale is locally square integrable
as required for the definition of $\langle M\rangle$. A further disadvantage of $\langle M\rangle$ is that it
is not invariant under mutually absolutely continuous measure change. See page 123
of Protter [24] for a discussion of these points.

In spite of the above, for a discussion limited to $\mathcal{M}^{loc}$ the angular brackets process
has some appeal. For instance, as we see below, (strong) orthogonality of M
and N is equivalent to $\langle M, N\rangle = 0$. It seems much more intuitive to say that
two objects M, N are orthogonal when some bilinear form is equal to zero than
when $[M, N]$ is a uniformly integrable martingale, which is the equivalent statement
couched in square bracket terms. For these reasons we prefer to use $\langle M\rangle$ in the
following sections.


4.3 $\mathcal{M}$ as a Hilbert Space


The martingale convergence theorem implies that each $M \in \mathcal{M}$ is closed, i.e.,
there is an $\mathcal{F}_\infty$-measurable random variable $M_\infty$ such that $M_t \to M_\infty$ in $L_2$ and,
for each t, $M_t = E[M_\infty | \mathcal{F}_t]$. Thus there is a one-to-one correspondence between
$\mathcal{M}$ and $L_2(\Omega, \mathcal{F}, P)$, so that $\mathcal{M}$ is a Hilbert space under the inner product $M \cdot N =
E[M_\infty N_\infty]$. We say that H is a stable subspace of $\mathcal{M}$ if $M \in H \Rightarrow \int \phi\, dM \in H$
for all $\phi \in L_2(\langle M\rangle)$. If H is a stable subspace, then so is
$H^\perp = \{Y \in \mathcal{M} : Y \perp X \text{ for all } X \in H\}$. The stable subspace generated
by M is $S(M) = \{\int \phi\, dM : \phi \in L_2(\langle M\rangle)\}$. It turns out that
$N \perp S(M) \Leftrightarrow \langle M, N\rangle = 0$. More generally, the stable subspace $S(\mathcal{A})$ generated
by a subset $\mathcal{A} \subset \mathcal{M}$ is the smallest closed, stable subspace containing $\mathcal{A}$. The set of
continuous martingales $\mathcal{M}_c \subset \mathcal{M}$ is a stable subspace. Its orthogonal complement
$\mathcal{M}_d$ is the set of ‘purely discontinuous’ martingales.

The Hilbert space structure gives us a way of obtaining an abstract ‘martingale 
representation theorem’, stated as follows. 


























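Theorem 3. Suppose $L_2(\Omega, \mathcal{F}, P)$ is separable. Then there exists a sequence $M_i$, i =
1, 2, ... in $\mathcal{M}$ such that $\langle M_i, M_j\rangle = 0$ for $i \ne j$, and any $X \in L_2(\Omega, \mathcal{F}, P)$ can
be represented as

\[
X = \sum_{i=1}^\infty \int_0^\infty \phi_i(s)\, dM_i(s), \tag{4.2}
\]

for some sequence $\phi_i \in L_2(\langle M_i\rangle)$.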


The construction of $\phi_i$, $M_i$ in (4.2) is straightforward. Let $Y_i$, i = 1, 2, ..., be a
countable dense subset of $L_2(\Omega, \mathcal{F}, P)$, and set $M_1 = Y_1$. Now let $M_2(\infty)$ be the
projection of $Y_2$ onto $S(M_1)^\perp$ and define $M_2(t) = E[M_2(\infty) | \mathcal{F}_t]$. Then $S(M_1) \perp
S(M_2)$. We now define $M_3(\infty)$ as the projection of $Y_3$ onto $(S(M_1) \oplus S(M_2))^\perp$.
Continuing in this way we obtain a sequence of mutually orthogonal subspaces
$S(M_i)$ such that

\[
L_2(\Omega, \mathcal{F}, P) = \bigoplus_{i=1}^\infty S(M_i).
\]

The representation (4.2) follows.
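The projection step above has the same shape as Gram-Schmidt orthogonalization. The following Python sketch is a finite-dimensional analogy only, under the assumption that ordinary vectors in $\mathbb{R}^3$ stand in for the terminal values $Y_i$ and that one-dimensional spans stand in for the stable subspaces $S(M_i)$ (which in the chapter are much larger, being closed under stochastic integration). The function names are ours.

```python
def dot(u, v):
    """Euclidean inner product, standing in for E[M_inf N_inf]."""
    return sum(a * b for a, b in zip(u, v))

def orthogonal_sequence(ys, tol=1e-12):
    """Project each y onto the orthogonal complement of what has been built so far."""
    basis = []
    for y in ys:
        residual = list(y)
        for m in basis:
            coeff = dot(residual, m) / dot(m, m)
            residual = [r - coeff * a for r, a in zip(residual, m)]
        if dot(residual, residual) > tol:  # skip vectors already in the span
            basis.append(residual)
    return basis

ys = [[1.0, 1.0, 0.0], [1.0, 0.0, 1.0], [2.0, 1.0, 1.0], [0.0, 0.0, 1.0]]
ms = orthogonal_sequence(ys)
```

In this toy run the third vector is the sum of the first two, so its projection vanishes and it contributes nothing; the remaining three residuals are mutually orthogonal and span the whole space.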

Theorem 3 shows that, as long as $L_2(\Omega, \mathcal{F}, P)$ is separable, there is always
a countable sequence $M_1, M_2, \ldots \subset \mathcal{M}$ such that $\mathcal{M} = S(M_1, M_2, \ldots)$. The
question of interest is whether there is a finite set $\mathcal{A} = (M_1, \ldots, M_k)$ such that
$\mathcal{M} = S(\mathcal{A})$ and, if so, what is the minimum number k. Such a set is said to have
the predictable representation property. This property has acquired a new significance
in recent times in connection with mathematical finance, where $\mathcal{A}$ models a
set of price processes of traded financial assets, integrands $\phi_i$ are trading strategies
and stochastic integrals represent the gain from trade obtained by using the corresponding
strategy. If a set of assets $\mathcal{A}$ is traded and these assets have the predictable
representation property, then the market is complete, implying that there are uniquely
defined prices for all derivative securities. See, for example, Elliott and Kopp [16]
for an explanation of these ideas.

Davis and Varaiya considered the characterization of k in the 1974 paper [13].
Recall that the angular bracket process $\langle M\rangle$ is identified with a positive measure
on the predictable $\sigma$-field $\mathcal{P}$ in $(0, \infty) \times \Omega$ by defining

\[
\langle M\rangle(F) = E\int_{(0,\infty)} 1_F(t, \omega)\, d\langle M\rangle_t. \tag{4.3}
\]

The notation $\langle M\rangle \succ \langle N\rangle$ or $\langle M\rangle \sim \langle N\rangle$ signifies that the measure
$\langle N\rangle$ is absolutely continuous with respect to, or equivalent to, $\langle M\rangle$. We
obtained the following results.


Theorem 4. Suppose $\mathcal{M} = S(M_1, M_2, \ldots, M_k)$ where $k \le \infty$ ($k = \infty$ denotes
that the $M_i$ sequence is countably infinite). Then there exists a sequence $N_1, \ldots, N_l$
in $\mathcal{M}$, with $l \le k$ and $N_1 = M_1$, such that

(i) $S(N_1, \ldots, N_l) = S(M_1, \ldots, M_k)$;

(ii) $S(N_i) \perp S(N_j)$, $j \ne i$; and

(iii) $\langle N_1\rangle \succ \langle N_2\rangle \succ \cdots$.


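Theorem 5. Suppose $\mathcal{M} = S(M_1, \ldots, M_k) = S(N_1, \ldots, N_l)$ and that both sequences
satisfy (ii) of Theorem 4, with $\langle M_1\rangle \succ \langle M_2\rangle \succ \cdots$ and $\langle N_1\rangle \succ \langle N_2\rangle \succ \cdots$.
Then $\langle M_i\rangle \sim \langle N_i\rangle$ for all i, and in particular k = l.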


These theorems imply that there is a unique minimal cardinality for any set of martingales
with the predictable representation property. We call this number the multiplicity
of the filtration $\mathcal{F}_t$ (following earlier work on the Gaussian case by Cramér [6]).


4.4 The Brownian Case


This is the classic case, solved by K. Itô [17]. We take $(\Omega, \mathcal{F}, (\mathcal{F}_t), P, (W_t))$ to
be the canonical Wiener space, so that $W_t$ is Brownian motion and $\mathcal{F}_t$ is the natural
filtration of $W_t$. Of course, $W_t$ has continuous sample paths and $\langle W\rangle_t = t$. The
Lévy representation theorem states that Brownian motion is the only martingale with
these properties.


Theorem 6. $X \in L_2(\Omega, \mathcal{F}_\infty, P)$ if and only if

\[
X = x + \int_0^\infty \phi_t\, dW_t,
\]

where $\phi$ is an adapted process satisfying $E\int_0^\infty \phi_t^2\, dt < \infty$.





The most straightforward proof of this theorem is the one given by Øksendal [20].
For n = 1, 2, ... let $\mathcal{G}_n = \sigma\{W_{k/2^n}, k = 1, 2, \ldots, 2^{2n}\}$. Then $\mathcal{G}_n$ is increasing
and $\bigvee_n \mathcal{G}_n = \mathcal{F}_\infty$. It follows from this and the martingale convergence theorem
that if $X \in L_2(\Omega, \mathcal{F}_\infty, P)$, then $X_n \to X$ in $L_2$ where $X_n = E[X | \mathcal{G}_n]$. The
theorem is therefore proved if we can ‘represent’ $X_n$, which takes the form $X_n =
h(W_{t_1}, \ldots, W_{t_m})$ for some Borel function $h : \mathbb{R}^m \to \mathbb{R}$. $X_n$ can be approximated
in $L_2$ in the standard way by random variables $\tilde{X}_n = \tilde{h}(W_{t_1}, \ldots, W_{t_m})$ in which $\tilde{h}$
is a smooth function of compact support. A stochastic integral formula for $\tilde{X}_n$ can
be written down in a fairly explicit way, just by using the Itô formula and elementary
properties of the heat equation. See Davis [9] or Exercise 4.17 of Øksendal [20] for
details of this construction.
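The simplest instance of such an explicit formula is $X = W_T^2$, for which the Itô formula gives $X = T + \int_0^T 2W_t\, dW_t$, i.e. $x = T$ and $\phi_t = 2W_t$ on [0, T]. The Python sketch below is only a numerical illustration under our own assumptions (a finite horizon T, an equally spaced grid, a fixed seed, and the function name `ito_check`); it is not the construction used in the cited proofs.

```python
import random

def ito_check(T=1.0, n=20000, seed=0):
    """Simulate one Brownian path on [0, T] and accumulate the Ito sums for W_T^2."""
    rng = random.Random(seed)
    dt = T / n
    w = 0.0
    integral = 0.0   # sum of 2 * W_{t_i} * (W_{t_{i+1}} - W_{t_i})
    quad_var = 0.0   # sum of squared increments, approximating <W>_T = T
    for _ in range(n):
        dw = rng.gauss(0.0, dt ** 0.5)
        integral += 2.0 * w * dw
        quad_var += dw * dw
        w += dw
    return w * w, integral, quad_var

x, integral, quad_var = ito_check()
```

On every path the discrete identity $W_T^2 = \sum 2W_{t_i}\Delta W_i + \sum (\Delta W_i)^2$ holds exactly, and the second sum concentrates around T as the grid is refined, which is the ‘multiplication table’ entry $(dW_t)^2 = dt$ at work.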

A very neat alternative proof was devised by Dellacherie [14] (see also Davis
[7]). The theorem is equivalent to the implication $X \in S(W)^\perp \Rightarrow X = 0$ a.s.
Suppose $X \perp S(W)$, let $\tau_n = \inf\{t : |X_t| \ge 1/n\}$ and define

\[
\Lambda^n_t = 1 + \tfrac{1}{2} X_{t \wedge \tau_n}.
\]
Since all martingales of the Brownian filtration are continuous,² $\Lambda^n_\infty > 0$ a.s. and we
define a measure $Q_n$ equivalent to P by $dQ_n/dP = \Lambda^n_\infty$. Now $\Lambda^n - 1 \in S(W)^\perp$,
so that $W\Lambda^n$ is a P-martingale, implying that W is a $Q_n$-martingale and hence
(by the Lévy theorem) a $Q_n$-Brownian motion. Thus $Q_n$ and P coincide on $\mathcal{F}_\infty$,
implying that $X_{\tau_n} = 0$ a.s. and therefore that X = 0 a.s.














4.5 The Jacod-Yor Theorem 


In Theorems 4 and 5 we thought of the predictable representation property as
being a characteristic of the filtration $\mathcal{F}_t$. Alternatively, we can think of this property
in relation to the measure P in the underlying probability triple $(\Omega, \mathcal{F}, P)$. The argument
given at the end of the last section gives a hint as to why considering alternative
measures might be a fruitful thing to do.

² We need to establish this property without appealing to the representation theorem! In
[7], measure change arguments are used twice, first to establish continuity of martingales
and then, as here, to get the representation property. See Section V.4 of Revuz and Yor’s
Continuous Martingales and Brownian Motion for another direct proof of the continuity
property.

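For $\mathcal{A} \subset \mathcal{M}$, denote by $\mathbf{M}(\mathcal{A})$ the set of probability measures Q on $(\Omega, \mathcal{F})$ such
that each $M \in \mathcal{A}$ is a square-integrable Q-martingale. Clearly, $\mathbf{M}(\mathcal{A})$ is a convex
set. $Q \in \mathbf{M}(\mathcal{A})$ is an extreme point if $Q = \lambda Q_1 + (1 - \lambda) Q_2$ with $Q_1, Q_2 \in \mathbf{M}(\mathcal{A})$
implies $\lambda = 0$ or 1.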


Theorem 7 (Jacod–Yor [19]). Let $\mathcal{A}$ be a subset of $\mathcal{M}$ containing constant martingales.
Then $S(\mathcal{A}) = \mathcal{M}$ if and only if P is an extreme point of $\mathbf{M}(\mathcal{A})$.


This is Theorem IV.57 of Protter [24]. The proof is too lengthy to describe in detail
here, but we can show why extremality is a necessary condition. Indeed, suppose P
is not an extreme point; then $P = \lambda Q_1 + (1 - \lambda) Q_2$ for some $Q_1, Q_2 \in \mathbf{M}(\mathcal{A})$ and
$\lambda \in\, ]0, 1[$. Let $L_t = E[dQ_1/dP \mid \mathcal{F}_t]$. Then $1 = \lambda L_\infty + (1 - \lambda)\, dQ_2/dP \ge \lambda L_\infty$, so
$L_\infty \le \lambda^{-1}$ a.s. Hence $\tilde{L}_t = L_t - L_0 \in \mathcal{M}$. If $X \in S(\mathcal{A})$ then X is a $Q_1$-martingale,
so for any s < t and bounded $\mathcal{F}_s$-measurable H,

\[
E_P[X_t L_t H] = E_P[X_t L_\infty H] = E_{Q_1}[X_t H] = E_{Q_1}[X_s H] = E_P[X_s L_s H],
\]

so XL is a P-martingale. Hence $X\tilde{L}$ is a martingale, so that $\langle X, \tilde{L}\rangle = 0$. Since
X is arbitrary, $\tilde{L} \perp S(\mathcal{A})$, so it cannot be the case that $S(\mathcal{A}) = \mathcal{M}$. Note that this
argument is very close to Dellacherie’s proof of the Brownian representation theorem
given above in Section 4.4.

Of course, P is an extreme point of $\mathbf{M}(\mathcal{A})$ if $\mathbf{M}(\mathcal{A}) = \{P\}$, and this is the
way Theorem 7 is generally used in mathematical finance. The ‘first fundamental
theorem’ of mathematical finance states (very roughly) that absence of arbitrage op- 
portunities is equivalent to existence of an equivalent martingale measure (EMM), 
i.e., a measure Q under which each M € A is a martingale, where A is the set of 
price processes of traded assets in the market model. The ‘second fundamental theo- 
rem’ states that the market is complete if there is a unique EMM. But this is (modulo 
technicalities) just an application of the Jacod—Yor theorem, since ‘completeness’ is 
tantamount to the predictable representation property. Thus the Jacod—Yor theorem 
is one of the cornerstones of modern finance theory. 


4.6 Lévy Processes 


Lévy processes have been around since — obviously — the original work of 
Paul Lévy in the 1930s and 1940s, but have recently been enjoying something of 
a renaissance, fueled in part by the need for asset price models in finance that go 
beyond the standard geometric Brownian motion model. The quickest introduction 
is still Section I.4 of Protter [24] (carried over from the 1990 first edition), but some 
excellent textbooks have recently appeared, including Applebaum [1], Bertoin [3], 
Sato [26] and Schoutens [27]. There is also an informative collection of papers edited 
by Barndorff-Nielsen et al. [2].

A process $X = (X_t, t \ge 0)$ is a Lévy process if it has stationary independent
increments, $X_0 = 0$ and $X_t$ is continuous in probability. The probability law of X
is determined by the 1-dimensional distribution of $X_t$ for any t > 0, and this has
characteristic function

\[
E[e^{iuX_t}] = e^{t\psi(u)},
\]

where $\psi(u)$ is the log characteristic function of an infinitely-divisible distribution.
The Lévy-Khinchin formula shows that $\psi$ must take the form

\[
\psi(u) = iau - \frac{\sigma^2}{2} u^2 + \int_{-\infty}^{\infty} \left(e^{iux} - 1 - iux 1_{(|x| < 1)}\right) \nu(dx),
\]

where a, $\sigma$ are constants and the Lévy measure $\nu$ is a measure on $\mathbb{R}$ such that
$\nu(\{0\}) = 0$ and

\[
\int_{\mathbb{R}} (1 \wedge x^2)\, \nu(dx) < \infty. \tag{4.4}
\]


If $\nu = 0$ then X is Brownian motion with drift a and variance parameter $\sigma^2$. The
interpretation of $\nu$ is that if $A \subset \mathbb{R}$ is bounded away from 0 and $N_A(t)$ denotes the
counting process $N_A(t) = \sum_{s \le t} 1_{(\Delta X_s \in A)}$, then $N_A$ is a Poisson process with rate
$\nu(A)$. The integrability condition on $\nu$ implies that the total jump rate is generally
infinite and jumps occur at a dense set of times, although, for any $\epsilon > 0$, jumps of
size greater than $\epsilon$ occur at isolated times. Protter [24] shows that every Lévy process
has a càdlàg version. The sample paths have bounded variation if and only if $\sigma = 0$
and

\[
\int_{\mathbb{R}} (1 \wedge |x|)\, \nu(dx) < \infty. \tag{4.5}
\]


The $L_2$ theory of Lévy processes is explored in a beautiful little paper by Nualart
and Schoutens [23], on which this section is mainly based. The condition on the
Lévy measure is

\[
\int_{\mathbb{R} \setminus (-\epsilon, \epsilon)} e^{\lambda |x|}\, \nu(dx) < \infty \quad \text{for some } \epsilon, \lambda > 0. \tag{4.6}
\]

Condition (4.6) implies that $X_t$ has moments of all orders, and that polynomials are
dense in $L_2(\mathbb{R}, \mu_t)$, where $\mu_t$ is the distribution of $X_t$. A convenient basis for martingale
representation is provided by the so-called Teugels martingales, defined as follows.
We set $X_t^{(1)} = X_t$ and for $i \ge 2$,

\[
X_t^{(i)} = \sum_{0 < s \le t} (\Delta X_s)^i.
\]

Then $E X_t^{(i)} = m_i t$ where $m_1 = a$ and $m_i = \int x^i\, \nu(dx)$ for $i \ge 2$. The Teugels
martingales are

\[
Y_t^{(i)} = X_t^{(i)} - m_i t, \qquad i = 1, 2, \ldots,
\]

the compensated power jump process of order i. Let $\mathcal{T}$ denote the set of linear combinations
of the $Y^{(i)}$. The angular brackets processes associated with the Teugels
martingales are

\[
\langle Y^{(i)}, Y^{(j)}\rangle_t = \left(m_{i+j} + \sigma^2 1_{(i = j = 1)}\right) t. \tag{4.7}
\]


Let $\mathcal{R}$ be the set of polynomials on $\mathbb{R}$ endowed with the scalar product

\[
\langle p, q\rangle = \int_{\mathbb{R}} p(x) q(x) x^2\, \nu(dx) + \sigma^2 p(0) q(0).
\]

Then we see from (4.7) that $x^{i-1} \leftrightarrow Y^{(i)}$ is an inner product preserving map from $\mathcal{R}$ to $\mathcal{T}$, so
any orthogonalization of $\{1, x, x^2, \ldots\}$ gives a set of strongly orthogonal martingales
in $\mathcal{T}$. In particular we can find strongly orthogonal martingales $H^{(i)} \in \mathcal{T}$, i = 1, 2, ...
of the form

\[
H^{(i)} = Y^{(i)} + a_{i,i-1} Y^{(i-1)} + \cdots + a_{i,1} Y^{(1)}.
\]

In view of (4.7) the measures associated with the compensators $\langle H^{(i)}\rangle$ by (4.3)
are all proportional to the product measure $dt \times dP$ and hence these measures are all
equivalent (as long as $H^{(i)} \ne 0$).


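Theorem 8. The set $\{H^{(1)}, H^{(2)}, \ldots\}$ has the predictable representation property,
i.e., any $F \in L_2(\Omega, \mathcal{F}_\infty, P)$ has the representation

\[
F = EF + \sum_{i=1}^\infty \int_0^\infty \phi_i(t)\, dH_t^{(i)}
\]

for some predictable processes $\phi_i$ such that

\[
E\int_0^\infty \phi_i^2(t)\, dt < \infty.
\]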


The proof given in Nualart and Schoutens [23] proceeds by noting that polynomials
of the form $X_{t_1}^{k_1} (X_{t_2} - X_{t_1})^{k_2} \cdots (X_{t_n} - X_{t_{n-1}})^{k_n}$ are dense³ in $L_2(\Omega, \mathcal{F}_\infty, P)$,
and obtaining a representation of these polynomials using stochastic calculus.

An interesting special case is as follows. 


Corollary 1. Suppose $\sigma = 0$ and that the Lévy measure $\nu$ has finite support
$\{a_1, a_2, \ldots, a_n\}$. Then $\mathcal{A} = \{H^{(1)}, H^{(2)}, \ldots, H^{(n)}\}$ has the predictable representation
property.


This is equivalent to saying that, under the stated condition, $H^{(k)} = 0$ for k > n.
This fact is essentially due to non-singularity of the Vandermonde matrix

\[
\begin{pmatrix}
1 & a_1 & a_1^2 & \cdots & a_1^{n-1} \\
1 & a_2 & a_2^2 & \cdots & a_2^{n-1} \\
\vdots & \vdots & \vdots & & \vdots \\
1 & a_n & a_n^2 & \cdots & a_n^{n-1}
\end{pmatrix}.
\]

It follows from Theorem 4 and Theorem 5 that n is the minimum number of martingales
having the predictable representation property in this case.
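The vanishing of $H^{(k)}$ for k > n can be seen in a small exact computation. The Python sketch below is illustrative only: it assumes $\sigma = 0$ and a hypothetical Lévy measure with three atoms, uses the correspondence $x^{i-1} \leftrightarrow Y^{(i)}$, and represents each polynomial by its values at the atoms, which is all that the scalar product $\int p(x) q(x) x^2\, \nu(dx)$ sees. The function name and the chosen atoms and rates are ours.

```python
from fractions import Fraction

def teugels_gram_schmidt(atoms, rates, degree):
    """Gram-Schmidt on 1, x, ..., x^degree under <p, q> = sum_i rates[i] * a_i^2 * p(a_i) * q(a_i).

    Returns the squared norms of the orthogonalized polynomials.
    """
    weights = [Fraction(r) * Fraction(a) ** 2 for a, r in zip(atoms, rates)]

    def inner(p, q):
        return sum(w * u * v for w, u, v in zip(weights, p, q))

    basis, norms = [], []
    for k in range(degree + 1):
        # Values of the monomial x^k at the atoms of the Levy measure.
        p = [Fraction(a) ** k for a in atoms]
        for b in basis:
            nb = inner(b, b)
            if nb != 0:
                c = inner(p, b) / nb
                p = [u - c * v for u, v in zip(p, b)]
        basis.append(p)
        norms.append(inner(p, p))
    return norms

# Hypothetical Levy measure: atoms -1, 1, 2 with rates 1, 2, 3 (so n = 3).
norms = teugels_gram_schmidt(atoms=[-1, 1, 2], rates=[1, 2, 3], degree=4)
```

With three distinct nonzero atoms the first three squared norms are strictly positive, because the Vandermonde matrix is non-singular, and every later one is exactly zero: the orthogonalized images of $x^3$ and $x^4$, which correspond to $H^{(4)}$ and $H^{(5)}$, vanish.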


³ Incidentally, this shows that $L_2(\Omega, \mathcal{F}_\infty, P)$ is separable.


4.7 General Jump Processes


There is a simpler way to look at the case described above in Corollary 1. Indeed,
we can write the process $X_t$ as

\[ X_t = a_1 N_1(t) + \cdots + a_n N_n(t), \]

where the processes $N_i(t)$, defined by $N_i(t) = \sum_{s \le t} 1_{(\Delta X_s = a_i)}$, are independent
Poisson processes with rates $\lambda_i = \nu(\{a_i\})$. We have

\[ S(H^{(1)}, H^{(2)}, \ldots, H^{(n)}) = S(\tilde N_1, \ldots, \tilde N_n), \]

where $\tilde N_i$ is the compensated point process $\tilde N_i(t) = N_i(t) - \lambda_i t$, so the predictable
representation property can equally well be expressed in terms of integrals with
respect to the $\tilde N_i$ processes.
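
The role of the Vandermonde matrix can be made explicit with a short calculation. This is a sketch only. It assumes, following [23], that $Y^{(k)}$ denotes the compensated $k$-th power-jump process of $X$; that definition lies outside this passage, so it should be read as an assumption.

```latex
\begin{align*}
\sum_{s \le t} (\Delta X_s)^k &= \sum_{i=1}^{n} a_i^k \, N_i(t), \\
Y^{(k)}_t &= \sum_{i=1}^{n} a_i^k \, \tilde N_i(t), \qquad k = 1, 2, \ldots
\end{align*}
```

For $k = 1, \ldots, n$ the coefficient matrix $(a_i^k)$ is, up to transposition, the Vandermonde matrix of $a_1, \ldots, a_n$ with its $i$-th row multiplied by $a_i$. It is non-singular because the $a_i$ are distinct and non-zero. Each $\tilde N_i$ is therefore a linear combination of $Y^{(1)}, \ldots, Y^{(n)}$, and every $Y^{(k)}$ with $k > n$ is a linear combination of the first $n$, which is consistent with $H^{(k)} = 0$ for $k > n$.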

However, results of this sort are true in much greater generality: the representa- 
tion of martingales of jump processes was investigated in a series of papers in the 
1970s by, inter alia, Jacod [18], Boel, Varaiya and Wong [4], Chou and Meyer [5], 
Davis [8],[10] and Elliott [15]. In particular we do not need the Markov property, 
and can allow for processes taking values in much more general spaces. We follow 
the description in the Appendix of [10]. 

A stochastic jump process is a right-continuous piecewise-constant process $X_t$
taking values in $\Xi \cup \{\Delta\}$, where $\Xi$ is a Borel space and $\Delta$ an isolated ‘cemetery
state’. We take a point $z_0 \in \Xi$ and on some probability space $(\Omega, \mathcal{F}, P)$ we define a
countable sequence of pairs of random variables $(S_k, Z_k) \in Y$, $k = 1, 2, \ldots$, where
$Y = \mathbb{R}_+ \times \Xi$. We then define $T_k = \sum_{i=1}^{k} S_i$ and $T_\infty = \lim_{k \to \infty} T_k$ and define the
sample path $X_t$ by

\[
X_t = \begin{cases}
z_0, & 0 \le t < T_1, \\
Z_k, & T_k \le t < T_{k+1}, \\
\Delta, & t \ge T_\infty.
\end{cases}
\]


The law of $(X_t)$ can be specified by giving a family of conditional distributions
$\mu_k : Y^{k-1} \to \mathrm{Prob}(Y)$ (here $Y^0 = \emptyset$). For simplicity of exposition, let us assume
that $T_\infty = \infty$ a.s. We let $\mathcal{F}_t^X = \sigma\{X_s, 0 \le s \le t\}$ be the completed natural
filtration.
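
The construction of the sample path from the inter-jump times and post-jump states can be sketched in C. This is a minimal illustration under assumptions that are not in the text: states are coded as integers, only finitely many jumps are stored (so the path simply stays in the last state afterwards), and the function names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Jump times as cumulative sums of the inter-jump times:
 * T[k-1] holds T_k = S_1 + ... + S_k. */
static void jump_times(const double *S, size_t n, double *T)
{
    double sum = 0.0;
    for (size_t k = 0; k < n; k++) {
        assert(S[k] >= 0.0);      /* inter-jump times are non-negative */
        sum += S[k];
        T[k] = sum;
    }
}

/* Value of the right-continuous, piecewise-constant path at time t:
 * z0 before the first jump, Z[k-1] on [T_k, T_{k+1}). */
static int path_value(double t, int z0,
                      const double *T, const int *Z, size_t n)
{
    int x = z0;
    for (size_t k = 0; k < n && T[k] <= t; k++)
        x = Z[k];
    return x;
}
```

Because the comparison is `T[k] <= t`, the path takes its new value at the jump time itself, which matches the right-continuity in the definition.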
For $A \in \mathcal{B}(\Xi)$ let

\[ p(t, A) = \sum_{T_i \le t} 1_{(Z_i \in A)}, \]

and let $\tilde p(t, A)$ be the predictable compensator of $p$, easily defined in terms of the
family of transition measures $\mu_k$, such that $q(t \wedge T_k, A) = p(t \wedge T_k, A) - \tilde p(t \wedge T_k, A)$
is a martingale for each $k$, so $q(t, A)$ is a local martingale. Stochastic integrals with
respect to $q$ are defined pathwise by


\[
M_t^g = \int_0^t\!\!\int_{\Xi} g(s, x, \omega)\, q(ds, dx)
      = \int_0^t\!\!\int_{\Xi} g(s, x, \omega)\, p(ds, dx) - \int_0^t\!\!\int_{\Xi} g(s, x, \omega)\, \tilde p(ds, dx).
\]


The appropriate class of integrands is

\[ L_1^{\mathrm{loc}}(p) = \Bigl\{ g : g \text{ is predictable and } E \int_0^\infty |g|\, 1_{(t < \tau_n)}\, dp < \infty \Bigr\}. \]


Here $\tau_n$ is a sequence of stopping times with $\tau_n \uparrow \infty$ a.s. The martingale representation
theorem is the following.


Theorem 9. $M_t$ is a local $\mathcal{F}_t^X$-martingale if and only if $M_t = M_t^g$ for all $t$ a.s. for
some $g \in L_1^{\mathrm{loc}}(p)$.


This is Theorem A5.5 of Davis [10]. The proof is a more-or-less bare hands calculation
using methods initiated by Dellacherie and by Chou-Meyer [5]. An $L_2$ version
is given by Elliott [15].


4.8 Concluding Remarks 


Martingale representation has been a recurring theme in stochastic analysis ever 
since the pioneering work of K. Itô [17] for the Brownian filtration. The results have
proved to be of key importance in several application areas, for example non-linear 
filtering and mathematical finance, and continue to be the inspiration for further de- 
velopments, most particularly in connection with Malliavin’s calculus on Wiener 
space (see Nualart [22]). We hope the reader will find this short survey useful in 
providing some background and context for this continually fascinating corner of 
stochastic analysis.


5.1 Introduction


Engineers have a major advantage over scientists. For the most part, the systems 
we analyze are of our own devising. It has not always been so. Not long ago, the
principal objective of engineering was to coax physical materials to do our bidding by
leveraging their intrinsic physical properties. The discipline was one of “applied sci- 
ence.” Today, a great deal of engineering is about coaxing abstractions that we have 
invented. The abstractions provided by microprocessors, programming languages, 
operating systems, and computer networks are only loosely linked to the underlying 
physics of electronics. 

The rapid improvements in the capabilities of electronics during the last half of 
the 20th century are, in part, the reason for this separation. The physical constraints
imposed by limited memory, processing speed, and communication bandwidth ap- 
peared to evaporate with each new generation of computers. What appeared to one 
generation as luxuriously inefficient abstractions became the bread and butter of the 
next generation. The separation of “computer science” from “electrical engineering” 
is both a consequence and a cause, fueling the separation and reflecting it at the same 
time. 

At the same time, the systems science that was incubated in the study of elec- 
tronic circuits (control systems, communications theory, and signal processing) has 
also become more abstract. Although these disciplines were created by true “electri- 
cal engineers” (“true” means that they were engaged with electrical systems), many 
of the practitioners today rarely encounter electricity directly. Their techniques are 
often realized in “embedded” software, ironically building on the abstractions that 
are only loosely connected to the electronics that their theory originally helped to 
create. The theories, however, have not adapted as well as one might hope to the 
world of software. Perhaps these theories remain too wedded to their physical her- 
itage. 

Computer scientists lament that the engineers who write embedded software use 
so few of the beautiful abstractions they have built. They write their code in C (or 
even in assembly code), using low-level (less abstract) mechanisms. They ignore
advances in object-oriented design, memory management, operating systems, and
programming languages, and instead directly interact with memory-mapped registers 
that set up timer interrupts, provide interrupt service routines, and build application- 
specific task schedulers. Wouldn’t it be nice if they would just learn to use the modern 
technology, and set up instead an HTTP server in Java? Or a peer-to-peer network of 
embedded sensor and actuator components that discover each other’s capabilities via 
JXTA? The problem is that the modern technology does not talk about the properties 
of the system that they have to control, such as timing. 

On the other hand, the information technology revolution of the late 20th century
was greatly accelerated by advancing computing abstractions. The Internet and the 
Web are not principally electronic systems. They are conceptual frameworks. The 
financial, economic, and social systems built on top of them have transformed our 
cultural landscape. But there is a weakness. While the computing abstractions we 
have built are extremely well suited to the management of information, their very 
divorce from the physics makes them less well suited to the management of our 
physical environment. This is the key reason that these abstractions have had less 
impact in embedded software.¹

It seems likely that embedded computing is the next transformational revolution. 
Although it may seem that computers are already everywhere, the real potential is 
vastly greater than what we have today. The National Research Council’s report Em- 
bedded Everywhere [4] summarizes this view in the introduction: 


“Information technology (IT) is on the verge of another revolution. Driven 
by the increasing capabilities and ever declining costs of computing and 
communications devices, IT is being embedded into a growing range of 
physical devices linked together through networks and will become ever 
more pervasive as the component technologies become smaller, faster, and 
cheaper... These networked systems of embedded computers ... have the po- 
tential to change radically the way people interact with their environment by 
linking together a range of devices and sensors that will allow information 
to be collected, shared, and processed in unprecedented ways. ... The use of 
[these embedded computers] throughout society could well dwarf previous 
milestones in the information revolution.” 


Sensor networks and “smart dust” [7] are only just breaking out of being labora- 
tory curiosities, but their successes to date imply that the electronics technology 
scales, and that leveraging advances in sensors, actuators, and wireless networking 
will make possible (and probable) a pervasiveness of computing that we can only 
dream of today. 

I am convinced, however, that the embedded revolution will require a reexamination,
and probably a reinvention, of some of the core abstractions of computing
and systems engineering. All effective abstractions hide properties of the underlying
systems, but the key to their effectiveness is that they hide the right properties. The
divorce of computing from physics has to end for this embedded revolution to take
hold.


1 It is often assumed that the real reason is that embedded software faces more severe resource
constraints than general purpose software. But as I have pointed out, resource constraints
have repeatedly evaporated with each new generation of computers, and yet the practice of
embedded software has changed remarkably little.

It is not only the abstractions of computing that have to adapt. Embedded comput- 
ing will also require a reexamination and reinvention of the core abstractions in the 
more physics-based engineering disciplines. The models that are used in civil, elec- 
trical, and mechanical engineering are deeply rooted in the interactions of physical 
devices, and poorly express the interactions of those physical devices with comput- 
ing. 

Consider a simple example. A physics-based model of a power distribution sys- 
tem will describe voltages and currents as a function of time, giving their dynamics 
as ordinary differential equations. The time variable, t ∈ ℝ, is universal. Its value in
Schenectady is the same as its value in San Francisco. But the software embedded 
in the control system for the power network has tremendous difficulty maintaining a 
common time base across a distributed system. Even with technologies such as GPS 
(which provides atomic clock timing precision worldwide), building software that 
works tightly in concert over geographically distributed systems is extremely diffi- 
cult. In fact, in the abstractions used to build the software, time is not a part of the 
ontology. No wonder the engineers who build this software are stuck working with 
very low-level mechanisms. 

Another example is more technically difficult: dealing with random behavior. In 
standard computing abstractions, we have had the luxury of largely not having to 
worry about this. This is partly because electronics technology (with some algorith- 
mic help from coding and communication theory) has delivered amazing reliability. 
Consider the fact that a 40 Gbyte hard disk can be copied flawlessly. This requires 
that the electronics process 320 billion bits without error. And operations like this 
occur by the millions on a daily basis. But when we switch our attention to embed- 
ded computers with energy scavenging and wireless communication, it is probably 
too much to expect such reliability. The computing abstractions will have to adapt. 

The engineering of systems that are composed of both physical and computa- 
tional components must be based on abstractions that embrace both physics and 
computation. There is huge potential for a new “systems science,” and there are a 
few visionaries exploring it. But the cultural divide between computing and engi- 
neering is a major barrier to progress. We must break down that barrier. 


5.2 Feedback Control, Hybrid Systems, and Beyond 


A computational systems theory must, of course, build on both theories of com- 
putation and classical systems theories [3]. Ideally, it identifies the common founda- 
tions, like theories of composition of components. For example, classical feedback 
control theory, as illustrated in Fig. 5.1, builds on a key insight, dating back to the 
1930s, that feedback systems can be effectively modeled self-referentially, using an 
abstraction of instantaneous feedback. At its roots, this principle rests on topological 
fixed point theory, the same foundation underlying recursion theory (a foundation
for many modern programming languages) [6] and many theories of concurrency
(e.g. the synchronous languages [2] and process networks [8]). It is extremely rare,
however, for engineering students (or even engineering faculty) to be aware of
these commonalities. That these commonalities are not exploited in the curriculum
is a consequence of the cultural divide that we created in the 20th century between
engineering and computer science.

Fig. 5.1. Illustration of the mathematical tools of classical feedback control systems (from [14]).

On the engineering side, we often misrepresent to our students that the connec- 
tion between the physical world and the world of software is simply a matter of 
discretizing time. As long as we respect the Nyquist sampling theorem, everything 
will be fine. Regrettably, software does not perform with the clock-tick regularity 
of discrete-time abstractions. And even if it did (or if we use tricks to achieve a 
reasonable approximation), the systems we build in software are far more complex 
than those we used to build with resistors, capacitors, and inductors. The linear-time- 
invariant abstraction that underlies so much of the pertinent systems theory is simply 
not applicable. No wonder engineers using embedded software are stuck with bench 
testing as their principal analysis tool. 

The theory of hybrid systems (see for example [5, 10, 16]) is a relatively recent
example of a modern systems theory, one that combines computation with classical 
systems theory. It combines the continuous-time world (or its discretized versions) 
with the world of irregularly timed mode transitions. It provides analytical tools that
are rooted in both linear-time-invariant systems and automata (see Fig. 5.2). It leverages
theories of computation to achieve decidability results (or, more commonly,
undecidability results) [9], and theories of feedback control to study dynamics and 
stability. Much of the pioneering work in this area was carried out by teams that 
included both computer scientists and electrical engineers.

[Figure 5.3: model diagram. Annotation in the figure: This model shows two (independent)
control loops whose controllers share the same CPU. The control loops are chosen such that
each is unstable if the control signals are constantly delayed. By choosing different priority
assignments and TM scheduling policies, different stability behavior of the two loops may
appear. For example, nonpreemptive scheduling can stabilize both control loops, but none of
the preemptive policies can.]

Fig. 5.3. Model of two software-based controllers executing on a single computer under the
control of a real-time operating system, where the controllers are each attempting to stabilize
an unstable plant (after [15]).

However, the current intellectual formulation of hybrid systems has its limitations.
It still relies on a model of time that poorly fits what software does. Consider
a simple example, due to Jie Liu [15], where two software-based controllers execute 
on a single computer under the control of a real-time operating system (RTOS). A 
model of such a system is shown in Fig. 5.3. A typical RTOS will offer schedul- 
ing policy alternatives, such as preemptive or non-preemptive multitasking, and will 
support the assignment of priorities to tasks. Under the formulation in Fig. 5.3, if the 
scheduling policy is preemptive multitasking, only one of the two feedback loops can 
be made stable (which one depends on the relative priorities). Under non-preemptive 
multitasking, both can be made stable. This difference is extremely hard to explain 
using classical control theory. If such a simple system renders our analytical tools 
useless, then engineers are forced to reject either the implementation technology or 
the analytical tools. In the former case, an engineer might choose to not share the 
same computer for the two control loops in order to be able to rely on the analytical 
results. In the latter case, an engineer will bench test the system to verify stability, 
tweaking priorities and scheduling policies until the desired behavior is achieved 
experimentally. Neither outcome is particularly attractive. 
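
The way a scheduling policy changes the delay that each control loop sees can be illustrated with a small simulation. This is only a sketch under stated assumptions: the task periods and execution times are invented for illustration, the tick-based scheduler is a toy model rather than the RTOS model of [15], and it reports response times only, not closed-loop stability.

```c
#include <assert.h>

#define NTASKS 2

/* A periodic task; index 0 has the highest priority (an assumption of this sketch). */
typedef struct {
    int period;   /* ticks between releases */
    int exec;     /* ticks of CPU time needed per release */
} task_t;

/*
 * Simulate fixed-priority scheduling on one CPU for `horizon` ticks and
 * record the worst response time (completion minus release) of each task.
 * Assumes every job finishes before the next release of the same task.
 */
static void simulate(const task_t tasks[NTASKS], int preemptive,
                     int horizon, int worst[NTASKS])
{
    int remaining[NTASKS] = {0};
    int released[NTASKS] = {0};
    int running = -1;                 /* task holding the CPU, or -1 if none */

    for (int i = 0; i < NTASKS; i++) {
        assert(tasks[i].period > 0 && tasks[i].exec > 0);
        worst[i] = 0;
    }

    for (int t = 0; t < horizon; t++) {
        /* Release a new job of every task whose period starts now. */
        for (int i = 0; i < NTASKS; i++) {
            if (t % tasks[i].period == 0) {
                remaining[i] = tasks[i].exec;
                released[i] = t;
            }
        }

        /* A preemptive scheduler re-decides every tick; a non-preemptive
         * one decides only when the CPU is free. */
        if (preemptive || running < 0) {
            running = -1;
            for (int i = 0; i < NTASKS; i++) {
                if (remaining[i] > 0) { running = i; break; }
            }
        }

        if (running >= 0) {
            remaining[running]--;
            if (remaining[running] == 0) {
                int response = t + 1 - released[running];
                if (response > worst[running]) worst[running] = response;
                running = -1;         /* CPU becomes free */
            }
        }
    }
}
```

With the hypothetical task set {period 4, execution 1} at high priority and {period 6, execution 3} at low priority, simulated for 12 ticks, the high-priority task's worst response time is 1 tick under preemptive scheduling and 2 ticks under non-preemptive scheduling, while the low-priority task's is 4 ticks in both cases. A controller designed for a fixed sampling-to-actuation delay sees a different plant under each policy, which is the kind of effect that a purely continuous-time or regularly sampled model does not capture.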

When our analytical tools break down even for such small, localized systems, 
how can we expect them to perform for large-scale distributed systems? The lack of 
an effective temporal abstraction in software is a major limitation. The tight binding 
of a universal time continuum with control theory is an equally major limitation. The 
future of systems theory is going to have to offer better time and concurrency ab- 
stractions that yield to both formal analysis and distributed and concurrent software 
realizations. 


5.3 A Focus on Systems 


A few years ago, Pravin Varaiya, David Messerschmitt and I led an effort at 
Berkeley that started down the road of updating the curriculum in the EECS department.
We began with our outdated introductory curriculum in EE, where an “introduction
to electrical engineering” was principally about passive analog circuits. The
rationale for the changes is described in [11], where we cite the considerable work 
of others that influenced our thinking. A truly long term (and highly speculative) 
vision is laid out in [12]. The first concrete outcome of this work was a new intro- 
ductory course on systems [13] and a supporting textbook [14]. Despite this modest 
progress, the vision remains incomplete and unfulfilled. Academic institutions have 
considerable inertia. 

A unifying theme in these efforts is an increased focus on systems rather than 
technologies. From [11], 


“First, we must prepare students to select abstractions, not just technologies. 
Second, just as designs can be built on top of higher level abstractions, so 
can courses.”

Selecting abstractions requires being able to reason about the properties of those abstractions.
All too often, engineering abstractions are presented as immutable facts
(“this is how computers work,” or “this differential equation describes that feedback
circuit”) rather than as human ideas (“this is how von Neumann proposed that we
control automatic machines,” or “ignoring the intrinsic randomness and latency in 
this circuit, Black proposed that we could idealize its behavior in this way”). When 
we present these ideas as immutable facts, we are doing it out of a genuine be- 
lief that the methods are useful to engineers. But we are failing to convey that in a 
rapidly changing technological climate, engineers must be prepared to think criti- 
cally about the engineering methods, not just about the engineering designs. When 
we teach modeling, we must also teach meta-modeling, where we discuss the mod- 
eling choices. 


5.4 Conclusion 


Abelson and Sussman describe computer science as “procedural epistemology” 
[1]. Indeed, 20th century computing was about procedure as knowledge. I believe 
that 21st century computing will transform into a system science that subsumes pro- 
cedure, but also embraces concurrency, time, randomness, and physicality. 21st cen- 
tury computing will be an epistemology of concurrent interacting components. And 
the highly valued engineering education will be that which focuses on systems rather 
than on technologies.


Summary. I take this opportunity with great pleasure to thank Prof. Pravin Varaiya for his
guidance over the past fifteen years not only in my academic research but also at Teja Tech- 
nologies, Inc. In this chapter I have outlined some of the emerging themes in system design 
particularly for network equipment. Factors such as the proprietary nature of many of the de- 
velopments, the rapid pace of change in the field, and also the desire to keep out material that 
may appear promotional of commercial interests have required this chapter to be kept at a 
fairly general level. 

My doctoral thesis work with Prof. Varaiya dealt with the modeling, analysis and control
of hybrid systems, i.e., systems which combine continuous and discrete state dynamics [1].
Subsequently, as a member of a research team at the California PATH Laboratory of
UC Berkeley, directed by Prof. Varaiya, I contributed to the development of a hybrid system
specification and simulation system called SHIFT [2]. After forming Teja Technologies, Inc., 
where Prof. Varaiya was on the Board of Directors for several years, I continued the core 
work on a commercial basis with emphasis on high-performance execution applied to fast 
path network applications. 


6.1 Architectures 


Improvements in silicon process technology (130nm to 90nm to 65nm) and pro- 
gressively challenging price-performance demands are driving the development of 
new techniques in system design and implementation. The density of gates in silicon 
now easily enables the integration of multiple functions on a single “system-on-a- 
chip,” or of multiple processor cores in a variety of interconnects, or a combination 
of software programmable and hardwired logic elements in application-specific, con- 
figurable architectures. But the availability of such an enabling capability does not by 
itself indicate how systems should be designed. On the other hand, system require- 
ments in terms of throughput and latency performance, power consumption, device 
area and cost continue to get more stringent. Because of the wide range of system 
functions, and their diverse definitions of performance, there is not yet a generalized 
guiding design principle for attaining the price-performance metrics across various 
domains. Over the last few years, several industry projects have been conducted in
packet processing and signal processing, and new projects are emerging in server
processing, that now start to show a pattern that can be used for general system de- 
sign automation. 

In this chapter, we shall consider systems for which throughput performance and 
not latency performance is the primary concern. In case latency is the primary con- 
cern, then the system developer must determine the parallel paths in the application’s 
algorithms and exploit them to the fullest in the implementation. Since these are 
likely to be different from one application to the next, it may be hard to find general 
techniques for designing such systems. One general principle does apply, however:
latency-minimizing systems are typically implemented to take in one data “sample”
and process it fully before starting on the next sample. Exploitation of parallelism is
greatly aided by the high degree of silicon integration since more circuits (= more 
parallel paths) can be implemented, thereby increasing the system performance (and 
also device area). It is our belief that existing approaches in electronic design au- 
tomation (EDA) are largely adequate in this problem domain. 

When the latency performance requirement is relaxed, a different approach to
system design opens up. In this approach, multiple “samples” are processed simultaneously,
thereby gaining systemic parallelism independent of the specific application
algorithms that are run on the samples. Typically, most packet processing applica- 
tions and a large class of signal processing applications are, within limits, relatively 
insensitive to latency performance, and hence benefit from such a treatment. 

Fig. 6.1 uses a simple example to illustrate the throughput performance bene- 
fits of processing multiple packets simultaneously. Fig. 6.1(a) shows a single packet 
processing block, Fig. 6.1(b) shows two blocks operating in parallel and Fig. 6.1(c) 
shows two blocks operating in a pipeline. In all cases, P packets arrive into the sys- 
tem per second. Each packet requires J instructions of processing, considering each 
block to be a processor. (Equivalently, any block could be implemented as hardwired 
logic, with each packet requiring a certain number of logic operations—without loss 
of generality, we will consider the processor case). 

In case 1.a, the single processor must execute P × J instructions per second. In
case 1.b, each processor receives every other packet for processing. Hence (ignoring 
second-order effects) it needs to run at only half the clock rate of the processor in 
case 1.a. In case 1.c, the first processor executes the first half of the instructions in 
the program and the second processor executes the second half. Since each processor 
executes only half the instructions per packet, again it needs to run at half the clock 
rate of the processor in case 1.a. 
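
The arithmetic behind the three cases can be written out as a short C sketch. The function names are hypothetical, and the sketch makes the same simplifications as the text: second-order effects are ignored, packets are spread evenly over parallel blocks, and the program divides evenly across pipeline stages.

```c
#include <assert.h>

/* Case (a): one block handles every packet and every instruction. */
static double single_rate(double packets_per_s, double instr_per_packet)
{
    return packets_per_s * instr_per_packet;
}

/* Case (b): n parallel blocks; each sees 1/n of the packets and runs
 * the whole program on each of them. */
static double parallel_rate(double packets_per_s, double instr_per_packet, int n)
{
    assert(n > 0);
    return (packets_per_s / n) * instr_per_packet;
}

/* Case (c): n pipeline stages; each sees every packet but runs only
 * 1/n of the program on it. */
static double pipelined_rate(double packets_per_s, double instr_per_packet, int n)
{
    assert(n > 0);
    return packets_per_s * (instr_per_packet / n);
}
```

For example, with P = 1,000,000 packets per second and J = 600 instructions per packet, the single block needs 600 million instructions per second, while each of two parallel blocks or two pipeline stages needs 300 million.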

In addition to reducing the processor clock rate, such parallel or pipelined archi- 
tectures provide better memory and I/O latency hiding since stalls in some proces- 
sors can overlap with computation in the other processors. Such latency hiding is 
enhanced when each processor can support multiple hardware contexts. 

While the scenarios described above are intentionally simplistic, typical architectures
have tens or hundreds of processor or logic blocks, additional logic
for arbitration of access to shared resources, internal communication paths between
blocks, caching and other local data storage schemes, coherency maintenance mechanisms,
and specialized hardware logic elements. Despite the complications introduced
by these factors in both device and system design, the benefits in price and
performance far outweigh the drawbacks, and the trend is towards increasing proliferation
of such architectures.

Fig. 6.1. Packet processing blocks. (a) Single packet processing block, (b) Parallel packet
processing blocks, (c) Pipelined packet processing blocks

The two architectures used in the example—parallel and pipelined—appear to 
hold the most promise going forward. The parallel architecture seems suited for 
“end-point” systems—i.e., systems that are the ultimate producers or consumers of 
packetized data. The pipeline architecture seems suited for “mid-point” systems— 
i.e., systems that are the forwarders of packetized data. Accordingly, symmetric, 
multi-threaded, multi-core processors are being employed in computer server sys- 
tems, and pipeline architectures are being employed in switches, routers and other 
network equipment. We note that while it is unlikely that pipeline architectures will
form the basis of computer server systems, there is no architectural barrier for parallel
architectures to be used in switches and routers. Since servers are intended for
general purpose processing, the parallel architectures will continue to exhibit fea- 
tures that are beneficial across multiple applications. It remains to be seen whether 
standard, off-the-shelf, general purpose pipeline architecture components shall sur- 
vive in the long run. On the other hand, since a given embedded system has a fairly 
specific application, there will be continuing need to develop application-specific 
pipeline architectures. 

At first sight it appears obvious that the parallel architecture must be used sym- 
metrically. Typically a symmetric multiprocessing operating system is run on the 
parallel architecture, and then the various applications are run on the operating sys- 
tem. In some server systems, multiple virtual machines are run on partitions of the 
parallel architecture to provide a reliable environment for running multiple applica- 
tions. But it has been shown convincingly that such symmetric uses of the architec- 
ture do not yield the best overall system performance. Performance is enhanced by 
reserving a portion of the computing resources for offloading the common functions 
of all applications such as, for example, layers 2—4 of packet processing and stored 
data access. The portion of the resources devoted to the offload function needs to 
be easily configurable in order to balance the overall system operation. In effect, the 
parallel architecture is converted into a two-stage pipeline between the offload processing
and the application processing stages. An overall system performance gain by
a factor of 3-5 has been demonstrated with this technique [3].

Coming now to pipeline architectures, we first notice that a general purpose 
processor is typically required beside the “fast path” pipeline in order to manage 
the control information used by the pipeline. While the pipeline efficiently executes 
application-specific functions, the control processor(s) run an operating system and 
general applications on top. 

Thus, at some level, there is a unification of the two architectures, with some 
computing resources devoted to an operating system running general purpose appli- 
cations and others reserved for offloading the fast path functions. 


6.2 System Design Automation 


Because of the application-independent nature of the systemic characteristics of 
throughput performance systems, and because of the unified nature of the target ar- 
chitectures, it becomes possible to develop general approaches for system design 
automation. The design flow of such an automation scheme has the following ele- 
ments: 


• specification of the system architecture,
• specification of the application algorithms and data structures,
• specification of the mapping of the application to the architecture,
• generation of the system based on these specifications,
• inspection of performance results of the generated system, and
• iteration over these steps until the desired price-performance tradeoffs are met.


Without loss of generality, additional layers can be implemented on top of this base- 
line design flow to support higher productivity features such as object-orientation 
and automated iteration using some sort of constrained optimization scheme. 
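
The iterate-until-met structure of the design flow can be sketched as a simple search loop. Everything here is a hypothetical stand-in: the "generated system" is reduced to a choice of processor count, "inspection" is the idealized per-processor rate P × J / n, and "price" is simply the number of processors. A real flow would generate and measure an actual system at each step.

```c
#include <assert.h>

/* Hypothetical design goal for a throughput-oriented system. */
typedef struct {
    double packets_per_s;      /* required throughput P */
    double instr_per_packet;   /* work per packet J */
    double max_instr_per_s;    /* highest rate one processor can sustain */
    int    max_processors;     /* budget: upper bound on processor count */
} design_goal_t;

/* Stand-in for "generate the system and inspect its performance":
 * idealized instruction rate each of n processors must sustain. */
static double required_rate_per_processor(const design_goal_t *g, int n)
{
    assert(n > 0);
    return g->packets_per_s * g->instr_per_packet / n;
}

/* Iterate over candidate designs, cheapest first, until the goal is met.
 * Returns the chosen processor count, or -1 if the budget is exhausted. */
static int iterate_design(const design_goal_t *g)
{
    for (int n = 1; n <= g->max_processors; n++) {
        if (required_rate_per_processor(g, n) <= g->max_instr_per_s)
            return n;
    }
    return -1;
}
```

Trying the cheapest candidate first means the first design that meets the performance target is also the lowest-priced one in this toy model. A constrained optimization scheme of the kind mentioned above would replace the linear scan.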

For pragmatic reasons such as training, reuse and legacy code base, it becomes 
necessary to choose the ANSI C programming language (and its associated pro- 
gramming model) for the various specifications, especially of the algorithms and 
data structures. It is important to keep any new concepts to a minimum, and to pro- 
vide mechanisms that enable these concepts to be applied to a wide range of system 
functions and implementations. We now describe some of the key concepts in these 
specifications. 

For architecture specification, there are the following concepts: architecture, pro- 
cessor, memory, logic block, bus, OS and process. An architecture is a container for 
these elements (including other architectures). Buses are used to connect elements. 
An OS runs on processor(s) and a process runs on an OS. Process, (free) processors 
and logic blocks are targets for mapping algorithms, and memories are targets for 
storing static data as well as managing dynamic data. Each of these elements can 
have a wide range of types and detailed properties based on specific implementa- 
tions. 
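
The containment and mapping rules in this paragraph can be sketched as a small C data model. This is not the Teja API: the type and function names are hypothetical, and reading a "free" processor as one with no OS running on it is an interpretation rather than something the text states.

```c
#include <assert.h>
#include <stddef.h>

typedef enum {
    EL_ARCHITECTURE, EL_PROCESSOR, EL_MEMORY, EL_LOGIC_BLOCK,
    EL_BUS, EL_OS, EL_PROCESS
} element_kind_t;

typedef struct element {
    element_kind_t kind;
    const char *name;
    const struct element *container;  /* enclosing architecture, NULL at top level */
    int has_os;                       /* processors only: nonzero if an OS runs on it */
} element_t;

/* Algorithms can be mapped onto processes, free processors and logic blocks. */
static int is_algorithm_target(const element_t *e)
{
    switch (e->kind) {
    case EL_PROCESS:
    case EL_LOGIC_BLOCK:
        return 1;
    case EL_PROCESSOR:
        return !e->has_os;            /* "free" read as: no OS on this processor */
    default:
        return 0;
    }
}

/* Static and dynamic data are placed in memories. */
static int is_data_target(const element_t *e)
{
    return e->kind == EL_MEMORY;
}

/* Number of architectures enclosing an element (architectures may nest). */
static int nesting_depth(const element_t *e)
{
    int depth = 0;
    for (const element_t *c = e->container; c != NULL; c = c->container)
        depth++;
    return depth;
}
```

A board-level architecture containing a chip-level architecture, which in turn contains processors and memories, gives the processors a nesting depth of 2 in this model.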

The following program fragment illustrates the definition of the Intel IXP2800
network processor. The purpose of this fragment and others that follow is to give a
flavor of the specifications using concrete examples. Their detailed explanations are
beyond the scope of this chapter. All data types and functions beginning with the
teja_ prefix illustrate the architecture specification concepts.


teja_architecture_t
create_ixp2800_architecture(teja_architecture_t container, const char *name)
{
    teja_architecture_t ixp2800;
    teja_processor_t xscale, ue[16];
    teja_memory_t scratchpad, rbuf, tbuf, shared[16];
    teja_memory_t local[16], local_shadow[16];
    teja_bus_t msf_bus, pci_bus, sram_bus, dram_bus;
    teja_bus_t ue_bus[16], slowport_bus;
    int i;
    char buf[256];
    char buf1[32];

    ixp2800 = teja_architecture_new(container, name, IXP2800);

    /* control processor */
    xscale = teja_processor_new(ixp2800, "xscale", XSCALE);

    /* the 16 microengines */
    for (i = 0; i < 16; i++) {
        sprintf(buf, "ue%i", i);
        sprintf(buf1, "%i", i);
        ue[i] = teja_processor_new(ixp2800, buf, IXPMEV2);
        teja_processor_set_property(ue[i], IXPMEV2_UENG_ID, buf1);
    }

    /* on-chip memories */
    scratchpad = teja_memory_new(ixp2800, "scratchpad", IXPSCRATCHPAD);
    rbuf = teja_memory_new(ixp2800, "rbuf", IXPRBUF);
    tbuf = teja_memory_new(ixp2800, "tbuf", IXPTBUF);

    for (i = 0; i < 16; i++) {
        sprintf(buf, "shared%i", i);
        shared[i] = teja_memory_new(ixp2800, buf, IXPSHAREDREG);

        sprintf(buf, "localmem%i", i);
        local[i] = teja_memory_new(ixp2800, buf, IXPLOCALMEM);
    }

    /* buses exported to the enclosing architecture */
    pci_bus = teja_bus_new(ixp2800, "pci_bus", PCI, EXPORTED_BUS);
    sram_bus = teja_bus_new(ixp2800, "sram_bus", SRAM, EXPORTED_BUS);
    dram_bus = teja_bus_new(ixp2800, "dram_bus", DRAM, EXPORTED_BUS);

    teja_processor_connect_default(xscale, sram_bus);
    teja_memory_connect_default(scratchpad, sram_bus);

    for (i = 0; i < 16; i++) {
        teja_memory_connect_default(local_shadow[i], ue_bus[i]);
    }

    return ixp2800;
}

The use of this architecture in the Intel IXDP2801 board design is shown in the 
following program fragment. 

teja_architecture_t
create_ixdp2801_architecture(teja_architecture_t container,
                             const char *name)
{
    teja_architecture_t ixp2800, ixdp2801;
    teja_bus_t pci_bus, sram_bus, dram_bus;

    ixdp2801 = teja_architecture_new(container, name, IXDP2801);

    ixp2800 = create_ixp2800_architecture(ixdp2801, "ixp2800");
    teja_architecture_set_property(ixp2800, IXP2800_DRAM_CLOCK_FREQ, "100.0");
    teja_architecture_connect(ixp2800, "pci_bus", pci_bus);
    teja_architecture_connect(ixp2800, "sram_bus", sram_bus);
    teja_architecture_connect(ixp2800, "dram_bus", dram_bus);

    return ixdp2801;
}


The use of this board with the Linux operating system running an application 
process is illustrated by the following program fragment. 


void packet_classifier_config() {
    teja_architecture_t pca, ixdp2801, ixp;
    teja_processor_t ue;
    teja_os_t linux;
    teja_process_t init;
    char buf[32];   /* "ixdp2801.ixp2800.ue15" needs 22 bytes with the terminator */
    int i;

    pca = teja_architecture_new(NULL, "pca", USER_DEFINED);
    ixdp2801 = create_ixdp2801_architecture(pca, "ixdp2801");

    for (i = 0; i < 16; i++) {
        sprintf(buf, "ixdp2801.ixp2800.ue%i", i);
        ue = teja_processor_lookup(pca, buf);
        teja_processor_set_property(ue, IXPMEV2_IN_USE_CTX, "4");
    }

    linux = teja_os_new(pca, "linux", MVLINUX);
    init = teja_process_new(pca, "init");
}


Algorithms and data structures are specified using the ANSI C language 
supplemented with two special features (expressed within the C syntax). The first is an 
API for making explicit the coordination of parallel programs. This API, known as the 
late-binding API, provides the abstractions of mutual exclusion, synchronized queues, 
asynchronous communication channels, dynamic memory pools and waiting for 
events. The use of this API in the application simply indicates the coordination points 
in the distributed logic. The API is bound to an implementation later in the design 
flow in the mapping specification. The second feature is the specification of time, 
event and logic-driven state machine structures within a sequential program by pro- 
viding a pragma that marks statement labels as states, with the corresponding inter- 
pretation that control flows to that labeled statement constitute state transitions of the 
automaton. 

The following program fragment illustrates a packet classifying algorithm specified 
using the late-binding API. A fragment of the alternative specification using state 
machine notation is also provided. All the data types and functions beginning with 
the teja_ prefix illustrate the late-binding API concepts. 


extern teja_channel_t channel;
extern teja_memorypool_t packet_pool;

extern struct max_header header_cache;
extern struct statistics stats;
extern struct exceptions exc;

void classifier()
{
    int total_pkts = 0;
    int len = 0;
    int *packet = NULL;
    int evt;
    struct packet_descriptor *pd;
    int opt_len;

    while (1) {
        // start to rx_status
        teja_wait(-1, -1, &evt, (void*) &pd, channel);

        if (pd != NULL) {
            packet = pd->pkt_ptr;
            len = pd->size;

            // rx_status to rx_success
            if (packet != NULL) {
                total_pkts += 1;
                teja_memcpy(&header_cache, packet, sizeof(struct max_header));

                // rx_success to ip
                if (header_cache.protocol_type == ASSIGNED_IP_NO) {
                    opt_len = (header_cache.ver_hdr_len & 0x0f) - 5;

                    // ip to tcp
                    if (header_cache.protocol == ASSIGNED_TCP_NO) {
                        statistics_received(&stats, TCP_TYPE);
                        // tcp to start
                        teja_memorypool_put_node(packet_pool, packet);
                    }
                    // ip to icmp
                    else if (header_cache.protocol == ASSIGNED_ICMP_NO) {
                        statistics_received(&stats, ICMP_TYPE);
                        // icmp to start
                        teja_memorypool_put_node(packet_pool, packet);
                    }
                    // ip to udp
                    else if (header_cache.protocol == ASSIGNED_UDP_NO) {
                        statistics_received(&stats, UDP_TYPE);
                        // udp to start
                        teja_memorypool_put_node(packet_pool, packet);
                    }
                }
                // rx_success to arp
                else if (header_cache.protocol_type == ASSIGNED_ARP_NO) {
                    statistics_received(&stats, ARP_TYPE);
                    // arp to start
                    teja_memorypool_put_node(packet_pool, packet);
                }
                // rx_success to start
                else {
                    exceptions_inc_drop_count(&exc);
                    teja_memorypool_put_node(packet_pool, packet);
                }
            } // rx_status to start
        } // start to start - implicit in original
    }
}


void alternative_classifier() {
    _Pragma("state") start:
    teja_wait(-1, -1, &evt, (void*) &pd, channel);
    if (pd != NULL) {
        packet = pd->pkt_ptr;
        len = pd->size;
        goto rx_status;
    }

    _Pragma("state") rx_status:
    if (packet != NULL) {
        total_pkts += 1;
        teja_memcpy(&header_cache, packet, sizeof(struct max_header));
        // cut through to next pipeline stage
        teja_send(channel, PACKET_EVENT, pd,
                  sizeof(struct packet_descriptor), packet_descriptor_pool);
        goto rx_success;
    }

    // ...
}
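
The labeled-state style can also be tried in plain C. The sketch below is only an illustration: it drops the pragma and all teja_ calls, and the names classify_all, struct frame, struct counts, TAG_IP and TAG_ARP, as well as the two-byte tag layout, are hypothetical choices made to keep the example self-contained. Each label stands for a state and each goto for a state transition.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical frame and counter types; not part of the late-binding API. */
struct frame  { const unsigned char *data; size_t len; };
struct counts { unsigned ip, arp, dropped; };

/* Assumed two-byte type tags used only by this sketch. */
#define TAG_IP  0x0800u
#define TAG_ARP 0x0806u

/* Classify n frames; every label is a state, every goto a transition. */
void classify_all(const struct frame *frames, size_t n, struct counts *out)
{
    size_t next = 0;
    unsigned tag = 0;
    const struct frame *f = NULL;

    assert(out != NULL);
    out->ip = out->arp = out->dropped = 0;

start:                      /* state: fetch the next frame, if any */
    if (next == n)
        return;
    f = &frames[next++];
    goto rx_status;

rx_status:                  /* state: check that a usable frame arrived */
    if (f->data == NULL || f->len < 2)
        goto drop;
    tag = ((unsigned)f->data[0] << 8) | f->data[1];
    goto rx_success;

rx_success:                 /* state: dispatch on the type tag */
    if (tag == TAG_IP)
        goto ip;
    if (tag == TAG_ARP)
        goto arp;
    goto drop;

ip:
    out->ip++;
    goto start;
arp:
    out->arp++;
    goto start;
drop:
    out->dropped++;
    goto start;
}
```

Because every transition is an explicit goto to a label, a tool can recover the automaton from the source text alone, which is the property the pragma-based notation relies on.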


The mapping specification combines the application and the architecture into a 
system: functions map to processes, processors or logic blocks, data structures map 
to memories, and the late-binding API maps to the detailed properties of the target 
architecture. 

The following program fragment illustrates the mapping specification. 


void packet_classifier_map()
{
    teja_os_map("lin", "pca.ixdp2801.ixp2800.xscale");

    teja_process_map_os("init", "lin");
    teja_run_on_process("setup", "init", INIT);

    teja_run_on_processor("generator",
                          "pca.ixdp2801.ixp2800.ue0",
                          GENERATOR);

    teja_producer_connect(GENERATOR, "channel");
    teja_consumer_connect(CLASSIFIER, "channel");

    teja_memorypool_map_sram_list("packet_descriptor_pool",
                                  1, 100, sizeof(struct packet_descriptor), 64);
    teja_channel_map_next_neighbor_ring("channel",
                                        "signal_based",
                                        0,
                                        0,
                                        "qdrsram");
    teja_variable_map_memory("arp_packet", "pca.ixdp2801.sram2");
    teja_variable_map_memory("stats", "pca.ixdp2801.ixp2800.scratchpad");

    teja_variable_map_memory("header_cache",
                             "pca.ixdp2801.ixp2800.localmem1");

    teja_init_function("init_packet_classifier",
                       "pca.ixdp2801.ixp2800.xscale");
}


The system generation step (implemented, for example, as the Teja C compiler 
tejacc) employs a novel approach compared to traditional compiler and operating 
system tools or to synthesis tools. First, tejacc compiles the architecture 
specification source files as a standard C program and executes it. The effect of this 
execution is to customize (the running copy of) tejacc for the target architecture. 
Then this customized tejacc parses and analyzes the application source files in 
the context of the target architecture. This provides the compiler with a whole-system 
view for validation and optimization, including the analysis and unrolling of the state 
machines and the inlining of functions and other constant optimizations. Finally, 
tejacc analyzes the mapping specification in preparation for system generation. To 
aid this step, two other platform components are used. One is a library of 
architectural infrastructure elements that implement the late-binding API for 
processor-based as well as logic-based parallel and pipelined architectures, and the 
other is a set of target-specific compiler back-ends that support the generation of 
processor-based microcode or logic-based HDL code. Thus, system generation keeps the 
set of concepts for architecture and algorithm specification small, targets a rich set 
of implementations through target-specific support packages, and attains high 
performance in all cases using global system optimization techniques. 
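
Executing the architecture specification amounts to building an in-memory model that later phases can query by name. How tejacc represents this model internally is not described in the text, so the following is only a minimal sketch under assumed names: struct element, element_new and element_lookup are hypothetical and are not part of the teja_ API. It shows nested containers and the resolution of a dotted path such as "ixdp2801.ixp2800.ue3" relative to a root container.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical model node: a named element that may contain other elements. */
struct element {
    char name[32];
    struct element *parent;
    struct element *first_child;
    struct element *next_sibling;
};

/* Create an element inside `parent` (pass NULL for a top-level container). */
struct element *element_new(struct element *parent, const char *name)
{
    struct element *e = calloc(1, sizeof *e);
    if (e == NULL)
        return NULL;
    strncpy(e->name, name, sizeof e->name - 1);  /* calloc keeps the terminator */
    e->parent = parent;
    if (parent != NULL) {
        e->next_sibling = parent->first_child;
        parent->first_child = e;
    }
    return e;
}

/* Resolve a dotted path relative to `root`; returns NULL if any part is missing. */
struct element *element_lookup(struct element *root, const char *path)
{
    struct element *cur = root;

    while (cur != NULL && *path != '\0') {
        size_t len = strcspn(path, ".");         /* length of the next component */
        struct element *child = cur->first_child;

        while (child != NULL &&
               !(strlen(child->name) == len &&
                 strncmp(child->name, path, len) == 0))
            child = child->next_sibling;

        cur = child;
        path += len;
        if (*path == '.')
            path++;
    }
    return cur;
}
```

With such a model in place, a mapping phase can validate every name it is given before any code is generated, which is one way to read the "whole-system view" mentioned above.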

The performance feedback on the generated system is keyed to the conceptual 
framework treated as a tightly-coupled distributed processing network. For each 
thread of execution the cycle counts through various called functions and state transi- 
tions as well as the busy/idle duty cycles are reported. For each coordination element 
such as a queue, mutex or channel, the loading in terms of number of stored mes- 
sages or requests and their arrival and departure statistics is reported. A time series 
of these reports can be analyzed using network techniques to discern bottlenecks and 
derive insight into a remapping of the application or a rearchitecting of the system. 
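
The kind of report described above can be summarized with very little code. The sketch below assumes a hypothetical per-interval record (struct sample) for one thread and one queue; the field names and the summarize function are illustrative and do not reflect the actual report format of the tool. It computes the busy/idle duty cycle and the peak backlog of the queue over a time series of samples.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical per-interval report for one thread and one queue. */
struct sample {
    unsigned long busy_cycles;   /* cycles the thread spent executing     */
    unsigned long idle_cycles;   /* cycles the thread spent waiting       */
    unsigned long arrivals;      /* messages enqueued during the interval */
    unsigned long departures;    /* messages dequeued during the interval */
};

struct summary {
    double duty_cycle;           /* busy / (busy + idle) over all samples    */
    unsigned long peak_backlog;  /* largest queue depth at any interval end  */
};

struct summary summarize(const struct sample *s, size_t n)
{
    struct summary out = { 0.0, 0 };
    unsigned long busy = 0, idle = 0, backlog = 0;
    size_t i;

    for (i = 0; i < n; i++) {
        busy += s[i].busy_cycles;
        idle += s[i].idle_cycles;

        /* Queue depth at the end of the interval, clamped at zero. */
        backlog += s[i].arrivals;
        backlog = (s[i].departures > backlog) ? 0 : backlog - s[i].departures;
        if (backlog > out.peak_backlog)
            out.peak_backlog = backlog;
    }
    if (busy + idle > 0)
        out.duty_cycle = (double)busy / (double)(busy + idle);
    return out;
}
```

A thread with a duty cycle close to one that feeds a queue with a growing backlog is a natural candidate bottleneck, and hence a candidate for remapping.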


6.3 Results and Future Directions 


We have implemented the system design automation scheme described in this 
chapter and applied it to multiple target applications and architectures with excel- 
lent performance and productivity results. The architectures targeted span parallel 
and pipeline structures of both fixed and configurable types such as parallel multi- 
threaded, multi-core server processors, pipelined multi-threaded, multi-core network 
processors and configurable FPGAs with combined parallel and pipelined architec- 
tures of both software programmable and hardwired elements. The applications span 
layers 2 to 4 of the OSI network model with some emerging applications all the way 
up to layer 7, with performance levels ranging from 1 to 10 gigabits per second for 
minimum size Ethernet frames at attractive resource efficiencies. 

In addition to the themes presented in this chapter, research directions include 
increased automation of software-hardware partitioning and software partitioning 
across multiple hardware resources, improvement in hardware logic generation, and 
the integration with test and formal verification platforms. 

Summary. Varaiya and Walrand found an elegant insight regarding the use of feedback in a 
causal coding context: While generally useful, feedback becomes useless when the channel 
is sufficiently symmetric. The goal of this note is to extend this insight to scenarios inspired 
by sensor networks. Specifically, two such scenarios are considered: a situation with a single 
sensor but where the source is observed through a noisy channel, and a genuine network sce- 
nario where all source and noise distributions are assumed to be Gaussian. For the latter, it is 
shown that feedback is useless if source and channel bandwidth are equal, but that, if the latter 
is larger, feedback is strictly useful. Varaiya and Walrand establish their results via dynamic 
programming arguments. It is unclear to date whether such arguments can be extended to the 
distributed scenario considered in the present chapter. Instead, our results are established via 
information-theoretic bounds. 


7.1 Introduction 


Causal coding and decoding is a desirable feature in a number of sensor network 
applications. Consider for example an earthquake monitoring sensor network in the 
framework of an emergency management system. Clearly, the process of communi- 
cating the sensed data to the central collection unit (i.e., their encoding and decoding) 
should introduce as little delay as possible, calling essentially for causal coding. An- 
other example is the case of sensor networks with very simple (i.e., cheap) sensor 
nodes that may have little or no space for storage. Causal coding permits us to effi- 
ciently take into account such a limitation. 

What is the use of feedback in a causal coding/decoding context? Varaiya and 
Walrand [20, 21] have developed a perspective and the necessary tools for analyzing 
causal encoding/decoding in a point-to-point communication system. One of their 
key results is that feedback is strictly useful in general causal coding problems, but 
that it becomes useless if the channel is sufficiently symmetric. This is by contrast to 
the case of arbitrary (not necessarily causal) encoding and decoding, unconstrained 
both in delay and in complexity, where it is well known that for discrete memory- 
less communication channels, feedback does not permit an increase in capacity [4, 
p. 212]. (It can, however, improve the reliability, i.e., enhance the decay of the error 
probability as a function of the duration of the transmission [3].) 

In this chapter, we extend the results of [20, 21] in two directions, both inspired 
by sensor network problems. First, we consider the scenario where the source is not 
observed directly by the encoder, but rather through a noisy channel, modeling the 
noise introduced by the sensing device. As we show, the same dynamic programming 
arguments as in [20, 21] can be used to establish the usefulness of feedback. 

Then, we study the case of multiple sensors that have to process their respective 
observed data in a distributed fashion before communicating it to a central data col- 
lection point. It is unclear to date whether dynamic programming arguments can be 
extended to the distributed scenario considered in this chapter. Instead, we restrict 
attention to one particular Gaussian sensor network model and use a different set 
of tools. In particular, using information-theoretic arguments, it is again possible to 
make claims about the usefulness of feedback. 

There are two main contributions for the considered Gaussian sensor network 
scenario. First, it is shown that when source and channel bandwidth are equal, feed- 
back is useless in a causal coding context. Second, it is shown that when the channel 
bandwidth exceeds the source bandwidth, feedback is strictly useful. 

These results, in some sense, confirm the basic intuition furnished by Varaiya 
and Walrand in [20, 21]: when the communication channel has an appropriate sym- 
metry, feedback is useless. Specifically, in our example, when source and chan- 
nel bandwidth are equal, the resulting overall situation can be seen as “sufficiently 
symmetric” for feedback to become useless. Conversely, in the absence of symme- 
try, [20, 21] provides examples where feedback is strictly useful in a causal coding 
context. Again, this is reflected by our findings in the sense that when source and 
channel bandwidths differ, feedback enhances performance. 

The rest of the chapter is organized as follows. In Section 7.2, the remote source- 
channel communication problem is considered. After setting up the problem and dis- 
cussing known information-theoretic bounds, we investigate the problem of causal 
encoding and decoding. The optimum schemes can be characterized using the meth- 
ods and arguments of [20, 21]. Related work has been presented in [1]. 

In Section 7.3, we provide detailed definitions and assumptions for the specific 
Gaussian sensor network investigated in the present chapter. 

While the arguments of [20, 21] can be extended easily to the remote source- 
channel coding problem (as we show in Section 7.2.4), there does not seem to be an 
extension to the distributed processing problem (as defined in Section 7.3). Using a 
different set of tools, in Sections 7.4 and 7.5, we present the two main results of the 
chapter, illustrating that feedback is useless in one case and useful in the other. 


7.2 The Remote Source-Channel Communication Problem 


7.2.1 Notations and conventions 


In this chapter, we will denote random variables by upper case letters, such as $X$, 
and their realizations by lower case letters, such as $x$. The alphabet (or range) of $X$ is 
denoted by $\mathcal{X}$, and the probability mass function (or probability density function if 
$X$ is continuous-valued) will be denoted by 

$$p_X(x). \tag{7.1}$$

The expectation operator will be denoted by $E[\cdot]$. 
Sequences of random variables will be denoted by $\{X[i]\}_{i=1}^n$, where $i$ is thought 
of as the (discrete) time index. Occasionally, we will use the shorthand 
$X^n = \{X[i]\}_{i=1}^n$. 

Fig. 7.1. The remote source-channel communication scenario: A source (SRC) is observed 
through a noisy process (OBS) by an encoder (ENC) whose task is to produce a signal to 
be transmitted across the channel (CHAN) to a decoder (DEC). The decoder must provide 
the destination (DEST) with the best possible estimate of the original source. Moreover, the 
encoder may also know the channel output signal, at a delay (D) of one time unit. 


7.2.2 Model definition and problem statement 


In this section, we study the communication system illustrated in Fig. 7.1. It is 
almost the standard source-channel communication system, except that the encoder 
does not get to observe the source directly. Rather, it merely gets a noisy version of 
the source outputs. This could be termed remote source observation, and we therefore 
refer to the resulting communication problem as the remote source-channel commu- 
nication scenario. Our consideration is motivated by the fact that the sensor reading 
is rarely the quantity that one is interested in. Rather, the sensor network is typically 
expected to reveal an underlying state of nature, which often can be observed only 
partially and subject to measurement noise in the sensing device. 

More precisely, for the scope of this chapter, we consider a (discrete-time) 
stationary ergodic source characterized by a sequence of random variables $\{S[i]\}_{i \ge 1}$, 
where $i$ denotes the (discrete) time index. The probability mass (or density) function 
of the sequence of random variables is assumed to be fixed and known. The source 
outputs are observed through a noisy observation process by the encoder, 
characterized in a probabilistic manner¹ by a conditional distribution $p(u|s)$. We consider 
block codes of length $n$. That is, upon observing a sequence $\{U[i]\}_{i=1}^n = \{u[i]\}_{i=1}^n$, 
the encoder must produce a codeword, i.e., a sequence of $n$ channel input symbols, 

$$\{x[i]\}_{i=1}^n = f_n\left(\{u[i]\}_{i=1}^n\right). \tag{7.2}$$


1 For simplicity, our considerations are restricted to the case where the noisy channel through 
which the source is observed is memoryless. 

More generally, as illustrated in Fig. 7.1, we also consider the case of (perfect) 
feedback. That is, beyond (7.2), the channel input signal may also depend on past channel 
outputs, as follows: 

$$x[i] = f_{n,i}\left(\{u[j]\}_{j=1}^{n}, \{y[j]\}_{j=1}^{i-1}\right). \tag{7.3}$$


In our communication model, there is a price or cost associated with each pair of 
channel input and output symbols, measured by a channel cost² function, 

$$\rho : \mathcal{X} \times \mathcal{Y} \to \mathbb{R}_+, \tag{7.4}$$

where we use $\mathbb{R}_+$ to denote the non-negative real numbers. In terms of this cost, 
the coding function of Equation (7.2) must be selected in such a way as to satisfy a 
constraint of the form 

$$\frac{1}{n} \sum_{i=1}^{n} E\left[\rho(X[i], Y[i])\right] \le P_n. \tag{7.5}$$

A usual choice of a channel cost function is $\rho(x) = |x|^2$, permitting us to associate 
the expected cost with the power consumption of the communication system. 

The channel is assumed to be memoryless and characterized by a conditional 
distribution $p_{Y|X}(y|x)$, that is, 

$$p\left(y[n] \mid x^n, y^{n-1}\right) = p_{Y|X}\left(y[n] \mid x[n]\right). \tag{7.6}$$


Based on the observed channel output sequence $y^n$, the decoder produces an 
estimate $\{\hat{S}[i]\}_{i=1}^n$ of the underlying source sequence, according to a mapping 

$$\{\hat{s}[i]\}_{i=1}^n = g_n\left(\{y[i]\}_{i=1}^n\right). \tag{7.7}$$

The overall coding/decoding effort is then assessed by the distortion between the 
actual source output sequence and its corresponding reconstruction at the decoder. 
To this end, the model specifies a distortion measure 

$$d : \mathcal{S} \times \hat{\mathcal{S}} \to \mathbb{R}_+. \tag{7.8}$$

The resulting distortion is assessed in terms of its average, 

$$D_n = \frac{1}{n} \sum_{i=1}^{n} E\left[d(S[i], \hat{S}[i])\right], \tag{7.9}$$

where the expectation is evaluated over the joint distribution of the pair of sequences 
$\left(\{S[i]\}_{i=1}^n, \{\hat{S}[i]\}_{i=1}^n\right)$. 


2 This terminology, while customary in the communication theory literature, see e.g. [6], is 
unfortunately in slight contradiction with [20, 21], where the term “cost” denotes what we 
will call “distortion” below. Therefore, we will be explicit and call $\rho$ the “channel cost 
function” throughout. 

The success of a particular block coding scheme (of block length $n$) is measured 
entirely in terms of the cost-distortion pair $(P_n, D_n)$. Optimal pairs are then optimal 
trade-offs between cost and distortion. More specifically, a pair $(P_n, D_n)$ represents 
an optimal system if at fixed cost $P_n$, there is no other code with distortion $D_n' < D_n$, 
and at fixed distortion $D_n$, there is no other code with cost $P_n' < P_n$. In this chapter, 
we are interested in the optimum performance irrespective of the complexity of the 
encoding and decoding, in the limit as $n \to \infty$. We consider in parallel two settings: 


(a) The case where the encoding and decoding mappings are entirely unconstrained. 
(b) The case where only causal encoding and decoding is allowed (as defined in [20, 
21]). 
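
For a single realization of a block, the cost-distortion pair can be estimated by sample averages. The sketch below assumes real-valued signals, the cost function $\rho(x) = |x|^2$ and the squared-error distortion $d(s, \hat{s}) = |s - \hat{s}|^2$; the function names average_cost and average_distortion are hypothetical. It computes sample averages only; the expectations in the definitions of $P_n$ and $D_n$ would additionally require averaging over many realizations.

```c
#include <assert.h>
#include <stddef.h>

/* Sample-average channel cost of one block, assuming rho(x) = |x|^2. */
double average_cost(const double *x, size_t n)
{
    double sum = 0.0;
    size_t i;

    assert(n > 0);
    for (i = 0; i < n; i++)
        sum += x[i] * x[i];
    return sum / (double)n;
}

/* Sample-average distortion of one block, assuming d(s, s_hat) = |s - s_hat|^2. */
double average_distortion(const double *s, const double *s_hat, size_t n)
{
    double sum = 0.0;
    size_t i;

    assert(n > 0);
    for (i = 0; i < n; i++) {
        double e = s[i] - s_hat[i];
        sum += e * e;
    }
    return sum / (double)n;
}
```

Comparing two codes then reduces to comparing the two resulting pairs: a code is dominated if another one achieves lower distortion at the same cost, or lower cost at the same distortion.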


7.2.3 Information-theoretic bounds 


In this subsection, we suppose that arbitrary encoding and decoding is allowed, 
in the sense that the encoder in Eq. (7.2) and the decoder in Eq. (7.7) may be arbi- 
trary mappings. The information-theoretic approach to this problem, developed by 
Shannon [19], is to determine the ultimate limits on $(P_n, D_n)$, as $n \to \infty$. This is 
sometimes referred to as the OPTA (optimum performance theoretically attainable), 
see [2, p.156]. 

For the remote source-channel communication scenario, the problem formulation 
and the corresponding solution have been found in [7]. An extended account can be 
found in [2, p.78, p.124ff.]. For the purposes of this chapter, we need the following 
result: 


Theorem 1. The optimum trade-off between cost and distortion (OPTA) for the re- 
mote source-channel communication scenario with a memoryless channel is charac- 
terized by 


$$R_{\mathrm{remote}}(D) = C(P), \tag{7.10}$$


where $R_{\mathrm{remote}}(D)$ denotes the remote rate-distortion function for the source, and 
C(P) the Shannon capacity of the communication channel. This is true irrespective 
of whether feedback from the channel output to the encoder is available. 


The proof of this theorem is briefly outlined in Appendix A. 
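
As an illustration of Theorem 1, consider a special case that is assumed here only for concreteness: a circularly symmetric complex Gaussian source of variance $\sigma_S^2$, observed as $U = S + W$ with independent Gaussian noise of variance $\sigma_W^2$, mean-squared error distortion, and an additive white Gaussian noise channel with noise variance $\sigma_Z^2$ and cost $\rho(x) = |x|^2$. Writing $D_0$ for the MMSE of $S$ given $U$ and $\sigma_V^2$ for the variance of the estimate $E[S \mid U]$, the standard formulas (rates per complex sample) give:

```latex
\begin{align*}
D_0 &= \frac{\sigma_S^2 \sigma_W^2}{\sigma_S^2 + \sigma_W^2}, \qquad
\sigma_V^2 = \frac{\sigma_S^4}{\sigma_S^2 + \sigma_W^2}
           = \sigma_S^2 - D_0, \\
R_{\mathrm{remote}}(D) &= \log \frac{\sigma_V^2}{D - D_0},
           \qquad D_0 < D \le \sigma_S^2, \\
C(P) &= \log\left(1 + \frac{P}{\sigma_Z^2}\right), \\
R_{\mathrm{remote}}(D) = C(P)
  \;&\Longrightarrow\;
  D = D_0 + \frac{\sigma_V^2}{1 + P/\sigma_Z^2}.
\end{align*}
```

In this special case, no amount of channel power brings the distortion below $D_0$, the error already incurred by the noisy observation, and at $P = 0$ the distortion equals the source variance $\sigma_S^2$.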


7.2.4 Causal coding for a first-order Markov source 


In this section, we derive the analogues of the results of [21] for the communi- 
cation scenario illustrated in Fig. 7.1 and defined in Section 7.2.2. That is, we now 
restrict the coding scheme to be causal: The encoder map in Eq. (7.3) must satisfy 


$$x[i] = f_{n,i}\left(\{u[j]\}_{j=1}^{i}, \{y[j]\}_{j=1}^{i-1}\right), \tag{7.11}$$


i.e., the $i$th component of the channel input may only be a function of $\{u[j]\}_{j=1}^{i}$ 
and of the past channel outputs. Similarly, the decoder map in Eq. (7.7) must satisfy 

$$\hat{s}[i] = g_{n,i}\left(\{y[j]\}_{j=1}^{i}\right). \tag{7.12}$$


This setup can be illustrated using a simple extension of the instructive example 
in [21, Sec. I.B], permitting us to conclude that feedback is strictly useful in a 
causal coding context, in some cases. 

As in [21, Sec. III.B], this example permits us to conclude that in a causal 
encoding-decoding context, feedback is strictly useful, by contrast to the case of 
arbitrary (hence, not necessarily causal) encoding-decoding (Theorem 1). 

By analogy to [21, Thm. 1], we obtain the following proposition: 


Proposition 1. There is an optimal causal code $f^*$ of the form 


$$x[i] = f_i^*\left(u[i], \{y[j]\}_{j=1}^{i-1}\right). \tag{7.13}$$


This proposition can be proved along the lines of [21, Thm. 1]. 


7.2.5 Summary — two notions of optimality 


In this section, we provided the basic problem setup of this chapter, namely, 
the remote source-channel communication problem. Two optimality criteria were 
discussed: On the one hand, information-theoretic optimality, where the encoding 
and decoding can be done with arbitrary complexity and delay. On the other hand, 
we studied optimal causal encoding and decoding schemes. 

It is interesting to point out that in certain special cases, these two notions coin- 
cide in the sense that, even with arbitrary complexity and delay, it is not feasible to 
obtain a performance superior to that of the best causal coding scheme. Such cases 
can be identified along the lines of [12]. 


7.3 The Considered Gaussian Sensor Network Scenario 


In this chapter, we study a particular sensor network model under which a Gaus- 
sian source is observed in noise by M sensors. These sensors communicate over an 
additive white Gaussian multiple-access channel. It is customary in the communica- 
tion theory literature to consider a constraint on the power transmitted by each node, 
reflecting the limited battery power at the nodes. However, in our sensor network 
model, the signals transmitted by the sensor nodes will typically be correlated. For 
such cases, regulatory bodies (such as the United States Federal Communications 
Commission (FCC)) are more likely to impose power constraints on the system as a 
whole.³ As shown elsewhere [8, 9], such a change in perspective leads to significantly 
different insights and conclusions in a communication network context. 

The sensor network model studied in this chapter is illustrated in Figs. 7.2 and 7.3 
and described in the sequel. 


3 Such a perspective is suggested by the current formulation of the Code of Federal Reg- 
ulations (CFR) Title 47, Part 15, Section 15.31(h). See http://www.gpoaccess.gov/cfr/ 
index.html 

Fig. 7.2. The additive white Gaussian multiple-access channel considered in this chapter. 


The communication channel in our sensor network is modeled by the standard 
additive white Gaussian multiple-access channel as defined e.g. in [4, Sec. 14.1.2] 
and shown in Fig. 7.2. The signals transmitted by the $M$ sensor nodes are 
complex-valued sequences $\{X_m[i]\}_{i \ge 1}$, for $m = 1, 2, \ldots, M$, selected appropriately by each 
node. The destination receives the signal 

$$Y[i] = Z[i] + \sum_{m=1}^{M} b_m X_m[i], \tag{7.14}$$

where $\{Z[i]\}_{i \ge 1}$ is a sequence of independent and identically distributed (i.i.d.) 
circularly symmetric complex Gaussian random variables of mean zero and variance $\sigma_Z^2$. 

In the point-to-point case discussed in Section 7.2, the coding schemes had to 
satisfy channel input cost constraints of the form (7.4)-(7.5). Similarly, here, we 
also consider the communication channel of Fig. 7.2 subject to constraints. Usually 
(see, e.g., [4, Sec.14.1.2]), the power transmitted by the nodes is constrained: 


$$\frac{1}{n} \sum_{i=1}^{n} E\left[|X_m[i]|^2\right] \le P_m, \tag{7.15}$$

for $m = 1, 2, \ldots, M$. This reflects well the physical limitations of transmitting nodes 
(finite power etc.). However, when the sequences $\{X_m[i]\}_{i=1}^n$, for $m = 1, 2, \ldots, M$, 
are not selected independently of each other, we argue that it becomes more 
meaningful to consider a constraint on the received power, that is, 

$$\frac{1}{n} \sum_{i=1}^{n} E\left[\left|\sum_{m=1}^{M} b_m X_m[i]\right|^2\right] \le Q. \tag{7.16}$$

Fig. 7.3. The source structure in the sensor network considered in this chapter: The source 
information $U_m$ at node $m$ is a noisy version of a common underlying source $S$, and the 
overall goal is to reconstruct $S$ to within mean-squared error distortion. 


The results provided in this chapter concern the latter case.⁴ 
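
A small worked example shows why the two kinds of constraint differ once the transmitted signals are correlated. It assumes, purely for illustration, unit channel coefficients ($b_m = 1$) and the same transmit power $P$ at every node; neither assumption is required by the model.

```latex
\begin{align*}
\text{independent, zero-mean } X_m[i]:\quad
  & E\left[\Big|\sum_{m=1}^{M} X_m[i]\Big|^2\right]
    = \sum_{m=1}^{M} E\left[|X_m[i]|^2\right] = M P, \\
\text{identical } X_m[i] = X[i]:\quad
  & E\left[\Big|\sum_{m=1}^{M} X_m[i]\Big|^2\right]
    = M^2\, E\left[|X[i]|^2\right] = M^2 P.
\end{align*}
```

With the same per-node budget, fully correlated signals therefore deliver $M$ times more power at the receiver than independent ones, which is the effect that a received-power constraint keeps in check.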


The source structure for the sensor network considered in this chapter is 
illustrated in Fig. 7.3. The underlying data of interest is denoted by $S^n = \{S[i]\}_{i=1}^n$, 
which we assume to be a sequence of i.i.d. circularly symmetric Gaussian random 
variables of mean zero and variance $\sigma_S^2$. 

Sensor $m$ observes a sequence $U_m^n = \{U_m[i]\}_{i=1}^n$, which depends on the source 
according to a conditional distribution $p(u_1^n, u_2^n, \ldots, u_M^n \mid s^n)$. For the scope of this 
chapter, we assume this to be given by 

$$U_m[i] = a_m S[i] + W_m[i], \tag{7.17}$$

where $W_m[i]$ are i.i.d. random variables (over $i$ and $m$) of mean zero and variance 
$\sigma_W^2$. 


7.4 Single Channel Use: Feedback Is Useless 


Let us now analyze the optimum performance in the sensor network defined in 
Section 7.3. In order to gain intuition for this problem, recall the following three key 
facts: 


1. Consider the case $M = 1$, and neglect the observation noises (that is, set $\sigma_W^2 = 0$). 
Then, following [14], it is easy to find the optimum causal code: it turns out 
that multiplying $U_1$ by an appropriate constant to meet the power constraint is, 
in fact, the optimum code in an information-theoretic sense, hence, a fortiori, 
the optimum causal code. 


4 For the case of transmit power constraints of the form of Equation (7.15), slightly weaker 
results were established in [10]. 


7 Causal Coding and Feedback in Gaussian Sensor Networks 99 


2. Continuing to neglect the noise in the source observation process (i.e., $\sigma_W^2 = 0$), 
the information-theoretically optimum communication strategy is to make all 
signals equal, potentially up to scaling (to account for the fact that the 
coefficients $b_m$ are different). This is sometimes referred to as beam-forming. 

3. Neglecting now the noise on the channel, i.e., $\sigma_Z^2 = 0$, it is also well-known that 
the minimum mean-squared error (MMSE) estimate of $S$ based on the noisy 
observations $U_1, \ldots, U_M$ is simply a suitable linear combination of the latter. 
Note that the considered multiple-access channel precisely permits us to form 
such a linear combination. 


These three insights suggest that a good causal strategy might be for each sensor to 
apply an appropriate scaling factor to its respective observation $U_m$, and to transmit 
this without further coding. Clearly, however, the three insights above are not suffi- 
cient to establish that it is the best causal code. In particular, it is unclear whether 
this will indeed be the best strategy to simultaneously suppress the measurement and 
the communication noises. The following theorem asserts that this is nevertheless the 
case: 


Theorem 2. For the source structure defined in this section and the additive white 
Gaussian MAC under a received-power constraint, the optimum performance for 
causal coding is 





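$$D = \frac{\sigma_S^2}{1 + \frac{A \sigma_S^2}{\sigma_W^2}} \left(1 + \frac{A \sigma_S^2 / \sigma_W^2}{1 + Q / \sigma_Z^2}\right), \tag{7.18}$$

where $A = \sum_{m=1}^{M} |a_m|^2$ and $Q$ is the received-power bound of (7.16). Moreover, 
this is the optimum performance in the sense 
that no other coding scheme, causal or not, can perform better. 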


The proof of this theorem is given in Appendix B. As discussed above, the op- 
timum causal coding scheme is simple. To establish optimality, in this case, we are 
able to show that the performance according to Theorem 2 is, in fact, the information- 
theoretic optimum, hence, cannot be improved upon by removing the causality con- 
straint. 
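
The uncoded scaling strategy discussed above can be checked numerically. The following is a sketch under simplifying assumptions that are not taken from the text: all coefficients are real-valued, the received signal power is set exactly to a budget q, and the function names are hypothetical. Sensor m is assumed to transmit g_m U_m with b_m g_m = c a_m, so that the receiver observes Y = c (A S + sum_m a_m W_m) + Z, and the code evaluates the error of the linear MMSE estimate of S from Y.

```c
#include <assert.h>
#include <stddef.h>

/* A = sum_m a_m^2; real-valued coefficients are assumed in this sketch. */
static double sum_sq(const double *a, size_t m)
{
    double s = 0.0;
    size_t i;

    for (i = 0; i < m; i++)
        s += a[i] * a[i];
    return s;
}

/*
 * Squared common scaling constant c^2, chosen so that the received signal
 * power c^2 * (A^2 * var_s + A * var_w) equals the budget q.
 */
double scaling_constant_sq(const double *a, size_t m,
                           double var_s, double var_w, double q)
{
    double A = sum_sq(a, m);

    assert(A > 0.0);
    return q / (A * A * var_s + A * var_w);
}

/* Mean-squared error of the linear MMSE estimate of S from Y. */
double uncoded_distortion(const double *a, size_t m, double var_s,
                          double var_w, double var_z, double q)
{
    double A = sum_sq(a, m);
    double c2 = scaling_constant_sq(a, m, var_s, var_w, q);
    double cross_sq = c2 * A * A * var_s * var_s;   /* |E[S Y]|^2        */

    return var_s - cross_sq / (q + var_z);          /* E|Y|^2 = q + var_z */
}
```

Since S and Y are jointly Gaussian in this model, the linear estimate is also the MMSE estimate, so the returned value is the distortion actually achieved by the uncoded scheme. The sketch does not establish optimality, which is the content of Theorem 2.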

The second question is whether feedback can improve on this performance. 
While the capacity of memoryless channels cannot be increased by feedback, it is 
known that in memoryless networks, feedback generally does increase capacity. In 
particular, this is true for the additive white Gaussian multiple-access channel (under 
a transmit-power constraint), see [17]. For the same channel, but under a received- 
power constraint, it can be shown that feedback cannot increase capacity [9]. For 
the considered sensor network, it follows directly from the proof of Theorem 2 that 
feedback does not permit us to enhance performance. That is, 


Corollary 1. Feedback is useless. 
The proof argument is given at the end of the converse proof in Appendix B. 


Remark 1. Notice that the coefficients $b_m$, $m = 1, 2, \ldots, M$, in
Fig. 7.2 are assumed to be deterministic (nonrandom) constants.


7.5 Two Channel Uses: Feedback Is Useful 


The preceding section concluded that feedback is useless for the considered sen- 
sor network situation. In this section, we consider a slight variation on the original 
setup, and show that this leads to a different conclusion. 

In particular, the variation is that the channel can be used twice for each source 
sample. Equivalently, this can be understood as the scenario where the (temporal) 
bandwidth of the source is half the bandwidth of the communication channel. In 
comparison to Section 7.4, there is additional communication capability in the setup 
of this section, and the question becomes that of optimally exploiting this benefit. 

In the absence of feedback, one way of exploiting the potential of two channel
uses per source sample in a causal coding/decoding context is to simply repeat the
source sample twice. It is easy to verify that for such repetition coding, the resulting
distortion is⁵

$$D = \frac{\sigma_S^2}{1 + \frac{\sigma_S^2}{\sigma_W^2} A} \left( 1 + \frac{\frac{\sigma_S^2}{\sigma_W^2} A}{1 + \frac{2Q}{\sigma_Z^2}} \right), \tag{7.19}$$

where $A = \sum_{m=1}^{M} |a_m|^2$. The general question of whether this can be improved
upon (in the absence of feedback) appears to be unanswered, even if the coding and
decoding are allowed to incur arbitrary delay and be arbitrarily complex.

Instead, we now consider the scenario where causal, noiseless feedback is avail- 
able. For the standard point-to-point communication problem, simple feedback strate- 
gies have been suggested that permit us to exploit additional channel bandwidth [5, 
11, 15, 18]. The strategy is to send, in the second channel use, merely the innovation 
between the source value that needs to be communicated, and the receiver’s current 
estimate. In extension of this work, the strategy can also be shown to perform op- 
timally in the point-to-point version of the remote source-channel communication 
scenario, as studied in Section 7.2. 

Combining this insight with that of Theorem 2, it is perhaps not surprising that 
such a “send the innovations only” strategy also works in a distributed setting such as 
the one studied in this chapter. Somewhat less immediate is the insight that the strat- 
egy, in fact, achieves the information-theoretic optimum, as asserted in the following 
theorem. 


Theorem 3. When noiseless, causal feedback is available, the optimum performance
for causal coding is

$$D = \frac{\sigma_S^2}{1 + \frac{\sigma_S^2}{\sigma_W^2} A} \left( 1 + \frac{\sigma_S^2}{\sigma_W^2} A \left( \frac{1}{1 + \frac{Q}{\sigma_Z^2}} \right)^2 \right), \tag{7.20}$$

where $A = \sum_{m=1}^{M} |a_m|^2$. Moreover, this is the optimum performance in the sense
that no other coding scheme, causal or not, can perform better.


⁵ It is interesting to note that while suboptimal in general, repetition coding by far
outperforms a strategy in which the source is compressed optimally in a distributed fashion, and
the resulting compression indices are communicated across the multiple-access channel
using capacity-achieving codes. See [13] for further detail.


The proof of this theorem is given in Appendix C. 
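To make the gain from feedback concrete, the sketch below compares three distortions for the same received power: one channel use (Theorem 2), repetition over two channel uses, and two channel uses with feedback (Theorem 3). The repetition expression is written under the assumption that the receiver averages the two received copies, which halves the effective channel noise variance; the function names and parameters are hypothetical.

```python
def distortion(var_s, var_w, a, channel_factor):
    """d_mmse * (1 + snr_obs * channel_factor), with a = sum_m |a_m|^2."""
    snr_obs = var_s * a / var_w
    return var_s / (1.0 + snr_obs) * (1.0 + snr_obs * channel_factor)


def one_use(var_s, var_w, var_z, q, a):
    # Theorem 2: a single channel use per source sample.
    return distortion(var_s, var_w, a, 1.0 / (1.0 + q / var_z))


def repetition(var_s, var_w, var_z, q, a):
    # Assumption: averaging two copies halves the channel noise variance.
    return distortion(var_s, var_w, a, 1.0 / (1.0 + 2.0 * q / var_z))


def with_feedback(var_s, var_w, var_z, q, a):
    # Theorem 3: the channel factor of Theorem 2 appears squared.
    return distortion(var_s, var_w, a, 1.0 / (1.0 + q / var_z) ** 2)


if __name__ == "__main__":
    for q in (0.5, 2.0, 10.0):
        args = (1.0, 0.5, 1.0, q, 2.0)
        print(q, one_use(*args), repetition(*args), with_feedback(*args))
```

Since $(1 + x)^2 > 1 + 2x$ for every $x > 0$, the feedback expression is strictly below the repetition expression whenever the received power is positive.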

For the main conclusion of this chapter, namely, the role of feedback in a causal 
coding/decoding context, the remaining task is to show that the performance of The- 
orem 3 cannot be achieved without feedback. 

This is difficult because the information-theoretically optimum performance 
without feedback is not known to date: While the information-theoretic source cod- 
ing problem for the source illustrated in Fig. 7.3 is solved [16], and the capacity of 
the additive white Gaussian multiple-access channel is known, there is no separation 
theorem for the overall communication system: it is suboptimal to combine these two 
codes. 

Using a different approach, we can establish the following proposition, proved in 
Appendix D. 


Proposition 2. For the sensor network considered in this chapter, when there are two 
channel uses per source sample, the smallest distortion achievable using a causal en- 
coder/decoder pair without feedback is strictly larger than the distortion achievable 
with feedback, given in Theorem 3. 


7.6 Extensions 


7.6.1 Collaboration between the sensors 


While we have focused in this chapter on the simple communication structure 
illustrated in Fig. 7.2, it is easy to see that the results apply unchanged to the case 
where arbitrary (constrained or unconstrained) cooperation between the sensor nodes 
is available. This is addressed in part in [13]. 


7.6.2 Arbitrary ratio between source symbols and channel uses 


The results and insights presented in Section 7.5 can be extended to the case 
where for each k source samples, n channel uses are available, where k < n. In 
extension of the analysis in [5, 15, 18], the coding scheme is modified to transmit 
suitable linear combinations of the k source samples that are jointly processed. In- 
novations are then furnished just like in the proof of Theorem 3. 

It is interesting to note that in the converse case, when k > n, it is unclear what 
a successful strategy will look like, even in the simple point-to-point case studied 
in [5, 15, 18]. 


7.6.3 Noisy feedback 


Another relevant extension concerns noisy feedback. From an information-theoretic
perspective, even in the point-to-point case, noisy feedback has not been widely studied.
The reason is simply that for memoryless channels, feedback does not permit us
to boost the performance in the first place. It does, of course, enable simple schemes
to attain optimum performance, but when the feedback becomes noisy, these schemes
usually break down, and it becomes more attractive to simply disregard the feedback.

The situation is fundamentally different for the sensor network considered in this 
chapter since the optimum performance without feedback is unknown, and in fact, it 
is unclear what coding scheme to use, other than repetition coding. Hence, feedback, 
even if noisy, can be exploited to improve the performance over repetition coding. 

The performance of the scheme used to establish Theorem 3 under noisy feedback
can be analyzed accordingly. In particular, suppose that instead of $Y_1[i]$, a signal
$Y_1[i] + V[i]$ is fed back, where $\{V[i]\}$ is a sequence of i.i.d. circularly symmetric
complex Gaussian random variables of mean zero and variance $\sigma_V^2$. It is then a simple
matter to evaluate the effect of this degradation on the scheme of Theorem 3.


7.7 Conclusions 


This chapter further investigates the significance of feedback in a causal coding 
context. It has been shown by Varaiya and Walrand [20, 21] that while feedback 
is useless if the channel structure is sufficiently symmetric, it can be strictly useful 
when such symmetry is absent. 

This chapter extends their study to a simple Gaussian sensor network case, and 
reveals a similar behavior: In the symmetric case (which, in our setup, is the case 
where source and channel bandwidth are equal), feedback turns out to be useless, 
whereas in the asymmetric case (when the channel bandwidth exceeds the source 
bandwidth), feedback is strictly useful. 

The elegant dynamic programming approach of [20, 21] appears to be hard to 
extend to a distributed setting such as the one considered here. Therefore, we estab- 
lished our results using a different set of tools. 


Acknowledgment. Inspiring discussions with Prof. J. Walrand and Prof. M. Vetterli 
are gratefully acknowledged. The work in this chapter was supported in part by the 
National Science Foundation under award CCF-0347298 (CAREER). 


A The Remote Source-Channel Communication Problem 


In extension of the standard point-to-point joint source-channel communication 
problem, the key to analyzing the communication scenario of Fig. 7.1 turns out to be
the remote rate-distortion function [2, 6, 7]: 


Definition 1. Define the remote rate-distortion function for a source $S$, characterized
by a probability distribution $p(s)$, an observation process $p(u|s)$, and a distortion
measure $d(s, \hat{s})$ as follows:

$$R_{\mathrm{remote}}(D) = \min I(U; \hat{S}), \tag{7.21}$$




where the minimization is performed over all conditional distributions $p(\hat{s}|u)$ for
which $E[d(S, \hat{S})] \le D$.


With this, we now prove Theorem 1. 

Proof of Theorem 1. Converse: Consider any source-channel code of block length
$n$, achieving a distortion $D$. This code induces a distribution $p(\hat{s}^n|u^n)$. It can be
shown that for this distribution,

$$I(U^n; \hat{S}^n) \ge n R_{\mathrm{remote}}(D). \tag{7.22}$$

By the data processing inequality,

$$I(U^n; \hat{S}^n) \le I(U^n; Y^n). \tag{7.23}$$

Furthermore, we can expand

$$I(U^n; Y^n) = H(Y^n) - H(Y^n|U^n) \le H(Y^n) - H(Y^n|X^n, U^n), \tag{7.24}$$

and by the fact that the channel is memoryless,

$$H(Y^n|X^n, U^n) = \sum_{i=1}^{n} H(Y_i|X_i). \tag{7.25}$$

Finally, since

$$H(Y^n) \le \sum_{i=1}^{n} H(Y_i), \tag{7.26}$$

we conclude that

$$n R_{\mathrm{remote}}(D) \le nC. \tag{7.27}$$

Achievability follows straightforwardly from the operational significance of the re- 
mote rate-distortion function and of capacity. 


B Proof of Theorem 2 


Converse: To obtain an upper bound to the best possible performance, we deter- 
mine the performance for the “idealized” scenario where all the encoders in Figs. 7.2 
and 7.3 are merged into one single device, turning the overall network effectively into 
a (memoryless) point-to-point (remote) source-channel communication system. For 
the latter, the optimum performance is described by Theorem 1. In order to evaluate 
the theorem, we have to determine the remote rate-distortion function for the scenario
of Fig. 7.3. Collect all the noisy observations into the vector $U = (U_1, \ldots, U_M)$.
Then, we can write

$$D_{\mathrm{remote}}(R) = \min E|S - \hat{S}|^2, \tag{7.28}$$

where the minimum is over all conditional distributions $p(\hat{s}|u)$ for which
$I(U; \hat{S}) \le R$. Rewriting this, we find

$$\begin{aligned}
D_{\mathrm{remote}}(R) &= \min_{p(\hat{s}|u):\, I(U;\hat{S}) \le R} E|S - f(U) + f(U) - \hat{S}|^2 \\
&\le \min_{p(\hat{s}|u):\, I(U;\hat{S}) \le R} E|S - f(U)|^2 + E|f(U) - \hat{S}|^2,
\end{aligned}$$

with equality if and only if

$$E\left[(S - f(U))(f(U) - \hat{S})^*\right] = 0. \tag{7.29}$$

Let us choose $f(U)$ to be the minimum mean-squared error (MMSE) estimate of
$S$ based on $U$. By the orthogonality principle, Condition (7.29) is satisfied, and the
remaining minimization problem is

$$\min_{p(\hat{s}|u):\, I(U;\hat{S}) \le R} E|f(U) - \hat{S}|^2,$$

which can be evaluated by standard arguments to yield $\mathrm{Var}(f(U))\, 2^{-R}$. Combining
terms, we find

$$D_{\mathrm{remote}}(R) = \frac{\sigma_S^2 \sigma_W^2}{\sum_{m=1}^{M} |a_m|^2 \sigma_S^2 + \sigma_W^2} \left( 1 + \frac{\sigma_S^2}{\sigma_W^2} \sum_{m=1}^{M} |a_m|^2 \, 2^{-R} \right). \tag{7.30}$$

The rate available to communicate in the “idealized” scenario is easily determined to
be

$$C = \log_2 \left( 1 + \frac{Q}{\sigma_Z^2} \right). \tag{7.31}$$

Combining (7.30) with (7.31) yields the claimed formula.
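
Spelled out, the substitution reads as follows. This is a worked step added for convenience, using the shorthand $A = \sum_{m=1}^{M} |a_m|^2$ from the statement of Theorem 2:

```latex
2^{-C} = \frac{1}{1 + Q/\sigma_Z^2} = \frac{\sigma_Z^2}{Q + \sigma_Z^2},
\qquad
D = D_{\mathrm{remote}}(C)
  = \frac{\sigma_S^2 \sigma_W^2}{\sigma_S^2 A + \sigma_W^2}
    \left( 1 + \frac{\sigma_S^2 A}{\sigma_W^2} \cdot \frac{\sigma_Z^2}{Q + \sigma_Z^2} \right).
```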

For Corollary 1, simply note that feedback clearly leaves the remote rate- 
distortion function, Equation (7.30), unchanged, but it also does not affect the ca- 
pacity of the additive white Gaussian (vector) channel in the “idealized” scenario, 
Equation (7.31), see e.g. [4, p.256]. 

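Achievability: Let us analyze the following coding scheme: During time slot $i$,
sensor $m$ transmits

$$X_m[i] = \frac{a_m^*}{b_m} \underbrace{\sqrt{\frac{Q}{\sigma_S^2 \left( \sum_{m'=1}^{M} |a_{m'}|^2 \right)^2 + \sigma_W^2 \sum_{m'=1}^{M} |a_{m'}|^2}}}_{=\alpha} \, U_m[i], \tag{7.32}$$

where $a_m^*$ denotes the complex conjugate of $a_m$. It is easily verified that this satisfies
the imposed constraint on the received sequence $\{Y[i]\}$, which takes the shape

$$Y[i] = \alpha \sum_{m=1}^{M} a_m^* U_m[i] + Z[i]. \tag{7.33}$$

Note that this implies that the joint distribution of $S[i]$ and $Y[i]$ is a multivariate
Gaussian distribution. To simplify notation, let us drop the time indices and merely
write $S$ and $Y$ instead of $S[i]$ and $Y[i]$, respectively. The distortion incurred by
estimating $S$ from $Y$ is well known to be determined by the formula

$$D = E[|S|^2] - \frac{E[SY^*]\, E[YS^*]}{E[|Y|^2]}, \tag{7.34}$$

where, from the model definition,

$$E[|S|^2] = \sigma_S^2, \tag{7.35}$$

and from the received-power constraint,

$$E[|Y|^2] = Q + \sigma_Z^2. \tag{7.36}$$

Furthermore, the term $E[SY^*]$ can be evaluated as follows:

$$\begin{aligned}
E[SY^*] &= \alpha \sum_{m=1}^{M} a_m E[S U_m^*] \\
&= \alpha \sum_{m=1}^{M} a_m a_m^* E[SS^*] \\
&= \alpha \sum_{m=1}^{M} |a_m|^2 \sigma_S^2.
\end{aligned} \tag{7.37}$$

Noting that $E[YS^*] = (E[SY^*])^*$, the distortion can be evaluated as follows:

$$\begin{aligned}
D &= \sigma_S^2 - \frac{\left( \sigma_S^2 \sum_{m=1}^{M} |a_m|^2 \right)^2}{\sigma_S^2 \left( \sum_{m=1}^{M} |a_m|^2 \right)^2 + \sigma_W^2 \sum_{m=1}^{M} |a_m|^2} \cdot \frac{Q}{Q + \sigma_Z^2} \\
&= \sigma_S^2 - \frac{\sigma_S^4 \sum_{m=1}^{M} |a_m|^2}{\sigma_W^2 + \sigma_S^2 \sum_{m=1}^{M} |a_m|^2} \cdot \frac{Q}{Q + \sigma_Z^2} \\
&= \frac{\sigma_S^2 \sigma_W^2}{\sigma_W^2 + \sigma_S^2 \sum_{m=1}^{M} |a_m|^2} + \frac{\sigma_S^4 \sum_{m=1}^{M} |a_m|^2}{\sigma_W^2 + \sigma_S^2 \sum_{m=1}^{M} |a_m|^2} \left( 1 - \frac{Q}{Q + \sigma_Z^2} \right) \\
&= \frac{\sigma_S^2 \sigma_W^2}{\sigma_W^2 + \sigma_S^2 \sum_{m=1}^{M} |a_m|^2} + \frac{\sigma_S^4 \sum_{m=1}^{M} |a_m|^2}{\sigma_W^2 + \sigma_S^2 \sum_{m=1}^{M} |a_m|^2} \cdot \frac{\sigma_Z^2}{Q + \sigma_Z^2} \\
&= \frac{\sigma_S^2 \sigma_W^2}{\sigma_W^2 + \sigma_S^2 \sum_{m=1}^{M} |a_m|^2} \left( 1 + \frac{\sigma_S^2}{\sigma_W^2} \sum_{m=1}^{M} |a_m|^2 \cdot \frac{\sigma_Z^2}{Q + \sigma_Z^2} \right),
\end{aligned} \tag{7.38}$$

which is the claimed formula.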
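
The achievability computation can also be checked by simulation. The following Python sketch (standard library only) draws complex Gaussian samples, applies the scaled uncoded transmission, forms the linear MMSE estimate of the source from the channel output, and returns the empirical distortion, which can be compared with the closed form. It assumes that the channel output is the sum of the terms $b_m X_m$ plus noise, so that the factor $1/b_m$ in the transmitted signal cancels; the function names, the seed and the example gains are hypothetical.

```python
import math
import random


def cgauss(rng, var):
    """Circularly symmetric complex Gaussian sample with E|x|^2 = var."""
    s = math.sqrt(var / 2.0)
    return complex(rng.gauss(0.0, s), rng.gauss(0.0, s))


def closed_form(var_s, var_w, var_z, q, gains):
    """Distortion predicted by the last line of the derivation above."""
    a = sum(abs(g) ** 2 for g in gains)
    snr_obs = var_s * a / var_w
    return var_s / (1.0 + snr_obs) * (1.0 + snr_obs * var_z / (q + var_z))


def simulate_uncoded(var_s, var_w, var_z, q, gains, n=100_000, seed=1):
    """Empirical E|S - S_hat|^2 for scaled uncoded transmission."""
    rng = random.Random(seed)
    gains = [complex(g) for g in gains]
    a = sum(abs(g) ** 2 for g in gains)
    alpha = math.sqrt(q / (var_s * a ** 2 + var_w * a))  # scaling factor
    c = alpha * a * var_s / (q + var_z)                  # E[S Y*] / E|Y|^2
    err = 0.0
    for _ in range(n):
        s = cgauss(rng, var_s)
        y = cgauss(rng, var_z)                           # channel noise Z
        for g in gains:
            u = g * s + cgauss(rng, var_w)               # U_m = a_m S + W_m
            y += alpha * g.conjugate() * u               # MAC adds the sensor signals
        err += abs(s - c * y) ** 2
    return err / n


if __name__ == "__main__":
    gains = [1.0, 0.5 + 0.5j, -0.8j]
    print(simulate_uncoded(1.0, 0.5, 1.0, 4.0, gains))
    print(closed_form(1.0, 0.5, 1.0, 4.0, gains))
```

The two printed values should agree up to Monte Carlo fluctuation, which shrinks roughly as the inverse square root of the number of samples.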


C Proof of Theorem 3


Converse: Let us proceed exactly along the lines of the proof of Theorem 2, presented
in Appendix B. The only change occurs in the rate available to communicate,
which simply is

$$C = 2 \log_2 \left( 1 + \frac{Q}{\sigma_Z^2} \right). \tag{7.39}$$

Combining (7.30) with (7.39) yields the claimed formula.
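Achievability: There are now, for each source sample, two channel uses available.
In the first channel use, we have

$$Y_1[i] = \underbrace{\sqrt{\frac{Q}{\sigma_S^2 \left( \sum_{m=1}^{M} |a_m|^2 \right)^2 + \sigma_W^2 \sum_{m=1}^{M} |a_m|^2}}}_{=\alpha} \sum_{m=1}^{M} a_m^* U_m[i] + Z_1[i]. \tag{7.40}$$

In the second channel use, the transmitters select their signals in such a way as
to make the received signal

$$Y_2[i] = \beta \left( \sum_{m=1}^{M} a_m^* U_m[i] - \gamma Y_1[i] \right) + Z_2[i], \tag{7.41}$$

where $\gamma$ is selected such as to ensure

$$E\left[ Y_1^*[i]\, Y_2[i] \right] = 0. \tag{7.42}$$

Hence, $\gamma$ is selected such that $\gamma Y_1[i]$ is the minimum mean-squared error estimate
of $\sum_{m=1}^{M} a_m^* U_m[i]$ based on $Y_1[i]$, that is

$$\begin{aligned}
\gamma &= \frac{E\left[ \left( \sum_{m=1}^{M} a_m^* U_m[i] \right) Y_1^*[i] \right]}{E\left[ |Y_1[i]|^2 \right]} \\
&= \frac{\alpha\, E\left[ \left( \sum_{m=1}^{M} a_m^* U_m[i] \right) \left( \sum_{m=1}^{M} a_m^* U_m[i] \right)^* \right]}{E\left[ |Y_1[i]|^2 \right]} \\
&= \frac{\alpha\, E\left[ \left| \sum_{m=1}^{M} a_m^* U_m[i] \right|^2 \right]}{E\left[ |Y_1[i]|^2 \right]},
\end{aligned} \tag{7.43}$$

which can be expressed as $\gamma = \frac{1}{\alpha} \frac{Q}{Q + \sigma_Z^2}$. Hence,

$$\begin{aligned}
Y_2[i] &= \beta \left( \sum_{m=1}^{M} a_m^* U_m[i] - \frac{1}{\alpha} \frac{Q}{Q + \sigma_Z^2} \left( \alpha \sum_{m=1}^{M} a_m^* U_m[i] + Z_1[i] \right) \right) + Z_2[i] \\
&= \beta \left( \frac{\sigma_Z^2}{Q + \sigma_Z^2} \sum_{m=1}^{M} a_m^* U_m[i] - \frac{1}{\alpha} \frac{Q}{Q + \sigma_Z^2} Z_1[i] \right) + Z_2[i].
\end{aligned} \tag{7.44}$$

It remains to determine $\beta$, which follows from ensuring that $E[|Y_2[i]|^2] \le Q + \sigma_Z^2$,
in line with the received-power constraint (7.36). That is,

$$E\left[ |Y_2[i]|^2 \right] = |\beta|^2 \left( \left( \frac{\sigma_Z^2}{Q + \sigma_Z^2} \right)^2 \frac{Q}{|\alpha|^2} + \frac{1}{|\alpha|^2} \left( \frac{Q}{Q + \sigma_Z^2} \right)^2 \sigma_Z^2 \right) + \sigma_Z^2, \tag{7.45}$$

which yields

$$\frac{|\beta|^2}{|\alpha|^2} = \frac{Q + \sigma_Z^2}{\sigma_Z^2}.$$

Finally, we have to evaluate the distortion incurred when estimating $S[i]$ from
$Y_1[i]$ and $Y_2[i]$. Again, from standard results about multivariate Gaussians, the
distortion can be expressed as

$$D = \sigma_S^2 - \frac{E[SY_1^*]\, E[Y_1 S^*]}{E[|Y_1|^2]} - \frac{E[SY_2^*]\, E[Y_2 S^*]}{E[|Y_2|^2]}, \tag{7.46}$$

since our code construction ensures Equation (7.42).
Again, $E[|Y_2|^2] = Q + \sigma_Z^2$ by construction, and the last term to evaluate is

$$\begin{aligned}
E[SY_2^*] &= E\left[ S \left( \beta \frac{\sigma_Z^2}{Q + \sigma_Z^2} \sum_{m=1}^{M} a_m^* U_m[i] \right)^* \right] \\
&= \beta^* \frac{\sigma_Z^2}{Q + \sigma_Z^2} \sum_{m=1}^{M} a_m E\left[ S U_m^*[i] \right] \\
&= \beta^* \frac{\sigma_Z^2 \sigma_S^2}{Q + \sigma_Z^2} \sum_{m=1}^{M} |a_m|^2.
\end{aligned} \tag{7.47}$$

Finally,

$$\frac{E[SY_2^*]\, E[Y_2 S^*]}{E[|Y_2|^2]} = \frac{|\beta|^2 \sigma_Z^4}{(Q + \sigma_Z^2)^3} \left( \sum_{m=1}^{M} |a_m|^2 \right)^2 \sigma_S^4. \tag{7.48}$$

Combining terms yields the claimed formula.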
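
The bookkeeping of this proof can be verified exactly from second moments, without simulation. In the sketch below every signal is represented by its real coefficients on four independent zero-mean components: the source S, the aggregate measurement noise N (the sum of the terms $a_m^* W_m$, with variance $A \sigma_W^2$), and the channel noises $Z_1$ and $Z_2$. The signal part of the second received signal is scaled to power Q, which is an assumption matching a total received power of $Q + \sigma_Z^2$. All names are hypothetical.

```python
import math


def theorem3_formula(var_s, var_w, var_z, q, a):
    """Distortion of Theorem 3, with a = sum_m |a_m|^2."""
    snr_obs = var_s * a / var_w
    return var_s / (1.0 + snr_obs) * (1.0 + snr_obs / (1.0 + q / var_z) ** 2)


def feedback_scheme_distortion(var_s, var_w, var_z, q, a):
    """Exact distortion of the 'send the innovation' scheme from covariances."""
    # Variances of the independent components (S, N, Z1, Z2).
    var = (var_s, a * var_w, var_z, var_z)

    def cov(x, y):
        return sum(xi * yi * v for xi, yi, v in zip(x, y, var))

    s = (1.0, 0.0, 0.0, 0.0)
    t = (a, 1.0, 0.0, 0.0)                     # T = sum_m a_m^* U_m = a*S + N
    alpha = math.sqrt(q / cov(t, t))
    y1 = (alpha * a, alpha, 1.0, 0.0)          # first channel use: alpha*T + Z1
    gamma = cov(t, y1) / cov(y1, y1)           # MMSE coefficient of T given Y1
    innov = tuple(ti - gamma * yi for ti, yi in zip(t, y1))
    beta = math.sqrt(q / cov(innov, innov))    # signal part of Y2 has power q
    y2 = tuple(beta * e for e in innov[:3]) + (1.0,)  # beta*innovation + Z2
    assert abs(cov(y1, y2)) < 1e-9             # Y1 and Y2 are uncorrelated
    return (var_s
            - cov(s, y1) ** 2 / cov(y1, y1)
            - cov(s, y2) ** 2 / cov(y2, y2))


if __name__ == "__main__":
    print(feedback_scheme_distortion(1.0, 1.0, 1.0, 1.0, 1.0))
    print(theorem3_formula(1.0, 1.0, 1.0, 1.0, 1.0))
```

For unit variances, unit received power and A = 1, both expressions evaluate to 0.625.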


D Proof of Proposition 2 


The main point of this proposition can be established by noting that under a
causality constraint both at the encoder and at the decoder, an equivalent problem
statement is the following: A Gaussian random variable $U$ can be arbitrarily transformed
into the pair $X_1 = f_1(U)$, $X_2 = f_2(U)$, for two arbitrary functions $f_1(\cdot)$
and $f_2(\cdot)$, respectively, satisfying $E[|f_1(U)|^2] \le P$ and $E[|f_2(U)|^2] \le P$. Based
on $Y_1 = X_1 + Z_1$ and $Y_2 = X_2 + Z_2$, the goal is to estimate $U$. The resulting
mean-squared error is

$$E\left[ |U - E[U \mid Y_1, Y_2]|^2 \right] = E\left[ |U - E[U \mid f_1(U) + Z_1, f_2(U) + Z_2]|^2 \right]. \tag{7.49}$$

The functions $f_1(\cdot)$ and $f_2(\cdot)$ must be selected so as to minimize (7.49).

To bound this distortion, we may use the following lemma (which is a rather
direct consequence of the maximum entropy theorem, stating that $h(U - \hat{U}) \le
\log(2\pi e \mathrm{Var}(U - \hat{U}))$):


Lemma 1. If $U$ is Gaussian with mean zero and variance $\sigma_U^2$, then for any $\hat{U}$,

$$\sigma_U^2 \, 2^{-I(U; \hat{U})} \le E\left[ |U - \hat{U}|^2 \right]. \tag{7.50}$$

Furthermore, using $g(\cdot)$ to denote the estimator function (i.e., the conditional mean
of $U$ given $f_1(U) + Z_1$ and $f_2(U) + Z_2$),

$$\begin{aligned}
I(U; \hat{U}) &= I(U; g(f_1(U) + Z_1, f_2(U) + Z_2)) \\
&\le I(U; f_1(U) + Z_1, f_2(U) + Z_2)
\end{aligned} \tag{7.51}$$

by the data processing inequality. Hence, consider the problem

$$\max I(U; f_1(U) + Z_1, f_2(U) + Z_2), \tag{7.52}$$

where the max is taken over all functions $f_1(\cdot)$ and $f_2(\cdot)$ satisfying $E[|f_1(U)|^2] \le P$
and $E[|f_2(U)|^2] \le P$.

The distortion achieved in Theorem 3 corresponds to the case when $f_1(U) + Z_1$
and $f_2(U) + Z_2$ are independent and Gaussian, which uniquely maximizes the mutual
information in (7.52), and achieves Lemma 1 with equality. However, it follows from
standard arguments about multivariate Gaussian distributions that there do not exist
(deterministic) functions $f_1(\cdot)$ and $f_2(\cdot)$ for which $f_1(U)$ and $f_2(U)$ are independent
Gaussian random variables. Hence, the resulting mutual information
$I(U; f_1(U) + Z_1, f_2(U) + Z_2)$ must be strictly smaller, and thus, following Lemma 1, the resulting
distortion must be strictly larger than the distortion of Theorem 3. □


Summary. We consider a set of networked controllers where multiple control systems coexist
with their control loops closed over a shared wireless network that induces random delays and 
packet losses. This system requires a joint design of the wireless network and the controllers, 
where the design objective is to optimize the control performance. This performance is a com- 
plex function of the controller design and the network parameters, such as throughput, packet 
delay and packet loss probability. Random delays and packet losses in the feedback loop im- 
pose new challenges on the optimal controller design. We first investigate controller design 
with randomly dropped packets. We prove the separation of estimation and control under cer- 
tain assumptions of the network and show that the Kalman filter can be modified to generate 
the optimal state estimate when part or all of the observation is lost. The wireless network 
needs to provide a sufficient throughput for each of the sensor measurements in order to guar- 
antee the stability of the Kalman filter. We then focus on the wireless network design for this 
controller. The goal of optimizing the control performance imposes implicit tradeoffs on the 
wireless network design as opposed to the explicit tradeoffs typical in wireless data and voice 
applications. Specifically, the tradeoffs between network throughput, time delay and packet 
loss probability are intricate and implicit in the control performance index, which complicates 
network optimization. We show that this optimization requires a cross-layer design frame- 
work, and propose such a framework for a broad class of networked control applications. We 
then illustrate this framework by a cross-layer optimization of the link layer, MAC layer, and 
sample period selection in a double inverted pendulum system. 


8.1 Introduction 


Distributed control over wireless networks has many compelling applications, in- 
cluding automated highway systems [1], automated factories, and smart homes and 
appliances. The deployment of wireless networks enables new control applications 
and allows fully mobile operation and flexible installation, while reducing maintenance
costs. Building a distributed control system supported by a wireless network
is a challenging task that requires a new design approach to both systems. Many of
the design challenges are similar to those in control over the Internet.

* This work is supported by ABB through the Stanford Networking Research Center and by
NSF under grant 0120912.

Control systems and communication networks are typically designed using very 
different principles. Traditional control theory requires the feedback data to be ac- 
curate, timely and lossless. Conversely, random delay and packet loss are generally 
accepted in communication network design and, indeed, very hard to avoid. This de- 
lay and loss is much more pronounced in wireless networks than in wired networks 
due to limited spectrum and power, time-varying channel gains, and interference. 

Joint design of control and communication is two-fold: the controller design 
needs to be robust and adaptive to the communication faults such as random delays 
and packet losses, while the network should be designed with the goal of optimiz- 
ing the end-to-end control performance. Furthermore, there is a tradeoff between 
communication and controller performance. From the control perspective, the more 
knowledge the controller has about the system, the better the control performance 
is. Additional knowledge about the system is obtained by increasing the number of 
sensors or sending sensor measurements more frequently. However, this increases 
the communication burden on the network and the network may become congested. 
The congestion results in longer delays and more packet losses, which degrade the 
control performance. Therefore, a joint design of the network and the controller is 
necessary. Joint design of control and communication has received little attention 
due to its inherent challenges and interdisciplinary nature. An example of such a 
joint design can be found in [2], where the controller synthesis and communication 
rate allocation is solved jointly with an iterative method. 

We consider a networked control system as in Fig. 8.1. Multiple control systems 
coexist and their feedback loops are closed over a shared wireless network. Each 
control system has a centralized controller. Both the sensor measurements and the 
control commands need to be communicated wirelessly. Our results in this chapter 
are closely related to our previous work [3] [4] [5] [6]. In [3] we studied the optimal 
Kalman filtering updates in the presence of random partial observation losses and the 
convergence properties as a function of packet loss probabilities. The communica- 
tion design trade-offs in the link layer and the medium access control (MAC) layer 
were evaluated in [4] and [5], respectively. Cross-layer design issues of the wireless 
network for distributed control applications are discussed in [6]. Other related work 
includes [7] [8] [9] [10].

[Figure with node labels: Sensor, Controller]

Fig. 8.1. Networked control over wireless.

We cast the joint control and communication design problem in a broader framework
of cross-layer design. Cross-layer network design has recently been applied to
many applications, such as video over wireless [11] and sensor networks with energy 
constraints [12]. Different aspects of cross-layer design in wireless ad hoc networks 
are considered in [13] [14] [15]. We use a cross-layer framework for the joint control 
and communication problem as it allows each layer of the network protocol stack to 
be optimized relative to the end-to-end control performance. We will specifically in- 
vestigate the interaction of the physical layer design, the MAC protocol choice, and 
the controller sampling period within our cross-layer design framework. 

The remainder of this chapter is organized as follows. In Section 8.2 we present 
the LQG optimal controller design in the presence of random delays and packet 
losses by proving the separation principle and finding the modified Kalman filter 
updates with partial observation losses. In Section 8.3 we explain the layered struc- 
ture of data networks and the cross-layer network design framework for networked 
control applications. In Section 8.4 we describe our wireless network model. In Sec- 
tion 8.5 we illustrate our iterative cross-layer design of the link layer, MAC layer, and 
sample period selection with a double inverted pendulum system. Our conclusions 
and discussion are given in Section 8.6. 


8.2 Control System Model and Optimal LQG Control 


We assume all the plants in our model are continuous-time linear time-invariant
systems and we represent the $n$-th system with the following state space equations:

$$\begin{aligned}
\dot{\mathbf{x}}^{<n>}(t) &= A^{<n>} \mathbf{x}^{<n>}(t) + B^{<n>} \mathbf{u}^{<n>}(t) + \mathbf{w}^{<n>}(t) \\
\mathbf{y}^{<n>}(t) &= C^{<n>} \mathbf{x}^{<n>}(t) + \mathbf{v}_s^{<n>}(t).
\end{aligned}$$

Here $\mathbf{x}^{<n>}(t)$ is the system state, $\mathbf{w}^{<n>}(t)$ is the disturbance acting on the plant,
$\mathbf{u}^{<n>}(t)$ is the control force, $\mathbf{y}^{<n>}(t)$ is the measured output, and $\mathbf{v}_s^{<n>}(t)$ is the
measurement noise. All boldface variables are vectors.

There are many control performance measures that can be considered and the
impact of imperfect communication for different measures can be different. We consider
a linear quadratic cost function as our performance measure. Specifically, we
want to minimize

$$J_{LQG} = \sum_{n=1}^{M} \lim_{t \to \infty} E\left[ \mathbf{x}'^{<n>}(t)\, Q_x^{<n>} \mathbf{x}^{<n>}(t) + \mathbf{u}'^{<n>}(t)\, Q_u^{<n>} \mathbf{u}^{<n>}(t) \right],$$

where the weight matrices satisfy $Q_x^{<n>} \ge 0$ and $Q_u^{<n>} > 0$, and $\mathbf{x}'$ is the transpose of
$\mathbf{x}$. We can tune the system performance by choosing different $Q_x^{<n>}$ and $Q_u^{<n>}$.
Note that the controllers are discrete-time and the control output is transformed to
a continuous-time signal via a zero-order hold. Thus the closed-loop system is a
sampled-data system. Minimizing the linear quadratic cost function we consider is
equivalent to minimizing the generalized $H_2$ norm (often abbreviated as the $H_2$
norm) with proper transformation [16]. Since all the systems have the same state-space
representations, we drop the superscript $<n>$ except when needed for clarification.

Different control systems share one wireless network and interact with each other 
only through the network sharing. The network design determines the packet delay 
distribution and packet loss probability that affect the control performance. If the 
delay distribution and packet loss probability are known, we can decouple the design 
and analysis of different control systems. In the rest of this section, we assume that 
the delay distribution and packet loss probability are known, and focus on the design 
and performance evaluation of a single closed-loop system. 

In the rest of this section, we first state our assumptions, under which the sepa- 
ration principle of estimation and control is proved even in the presence of random 
delays and packet losses. The state feedback controller with random control packet 
losses is derived and the modified Kalman filter updates are shown to be optimal for 
state estimation with random observation losses. Lastly we summarize our control 
design and show how to use Markovian Jump Linear System (MJLS) techniques to 
evaluate the system performance. 


8.2.1 Assumptions and timing issues 


[Figure: timing diagram along a time axis, with labels "New Sensor Measurements generated", "Old Packets Dropped", "Time Slot" and "1 Sample Period"]

Fig. 8.2. Timing diagram.


We illustrate our timing assumptions in Fig. 8.2. Time is evenly slotted and there 
are multiple time slots within one sample period. New sensor measurements are gen- 
erated at the beginning of each sample period and old packets are dropped when 
new packets are generated. A packet is declared lost if it has not been received by 
the end of the sample period. Within one time slot, exactly one data packet can be 
transmitted. The packet may be corrupted during the transmission due to noise and 
interference. Thus the receiver may not be able to decode the packet. Retransmis- 
sions are allowed when there are enough time slots in one sample period. In this 
chapter, we ignore the propagation time and processing time,² and only consider
retransmission delays. Due to the slotted nature, the delay only takes discrete values
that are in multiples of the length of a time slot. Note that all delays are bounded by
one sample period since packets are dropped at the end of every sample period.

² The propagation delay and processing delay are roughly constant. We can also model this
delay into the plant dynamics.

We assume that neither the sensors nor the actuator have computational capabil- 
ities. Thus the sensors can only send the measured data, but not a function of the 
measured data. For the ease of analysis, we assume that the actuator updates at most 
once per sample period. If no control command is received in a sample period, the 
actuator continues to use the previous control command until a new one is received. 
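
The timing assumptions above translate directly into a delay distribution and a loss probability for a single packet. The following sketch is only illustrative: it assumes that each transmission attempt succeeds independently with the same probability and that a fixed number of slots per sample period is available to the packet. Neither assumption, nor the names `delay_distribution`, `p_success` and `n_slots`, is taken from the chapter.

```python
def delay_distribution(p_success, n_slots):
    """Return ([P(delivered in slot k) for k = 1..n_slots], P(packet lost)).

    Assumes independent, identically distributed slot outcomes with
    retransmission until success or until the sample period ends.
    """
    delivered = [(1.0 - p_success) ** (k - 1) * p_success
                 for k in range(1, n_slots + 1)]
    loss = (1.0 - p_success) ** n_slots   # not received by the end of the period
    return delivered, loss


if __name__ == "__main__":
    delivered, loss = delay_distribution(0.9, 3)
    print(delivered, loss)
```

Under these assumptions the delay takes values that are multiples of the slot length, and the loss probability decays geometrically with the number of slots available per sample period.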


























[Figure: TDMA slot sequence of the form S1 | S2 | C | S1 | S2 | C | ..., with legend "S1: Sensor 1"]

Fig. 8.3. Timing illustration with TDMA.


Depending on the controller design, the control command may depend on the
time delay and the sensor measurements available to the controller. The controller
can either wait until it has received all sensor measurements before any transmission
to the actuator, or, in some cases, the controller can choose to send a control command
whenever the channel is available.³ For example, in a Time Division Multiple
Access (TDMA) network, the sensors and controllers take turns to transmit in a
predetermined order as shown in Fig. 8.3. In this figure, two control systems share the
network via TDMA. Each system has two sensors and one controller. If the controller
does not transmit in its given time slot, the time slot is wasted anyway. Thus
the controller should send a predicted control command based on the current
information. In the event of no further control commands received at the actuator, this
control command can be used. In this scenario, the actuator may receive more than
one control command in a sample period, but we assume that the actuator updates
at most once. So the actuator needs to set up certain rules to decide which control
command it uses to update. This is also part of the controller design and our design
is explained in Subsection 8.2.4.

We also assume that the actuator sends an acknowledgment (ACK) packet to
the controller when a control command is successfully received by the actuator. The
ACK packets are often small and thus require little network resource. We assume
that the ACK from the actuator to the controller experiences no loss with negligible
delay. This assumption of timely and lossless ACK on the control command is an
important one that allows the separation of estimation and control as we shall see in
the next subsection.

³ We could also have the controller wait a certain amount of time before it transmits to the
actuator.


8.2.2 Separation of estimation and control 


The separation principle was shown to be suboptimal in decentralized control 
by a well-known counterexample [17], where different distributed controllers have 
access to different information sets. It was argued in [18] that the separation principle 
does not hold in centralized control when the controller does not know if the previous 
control commands are received or lost. However, the separation principle has been 
proven to be optimal under certain assumptions in networked control systems. Gupta 
et al. [19] proved the optimality of the separation of estimation and control when
the sensor and the controller communicate over a packet-dropping link. The authors
assumed no packet losses from the controller to the actuator. Nilsson [20] proved the 
separation principle with random bounded delays in the feedback loop assuming no 
packet losses. 

Under our assumptions, the controllers do know if a particular control command 
is received by the actuator and at what time it is executed. In this subsection, we will 
prove the optimality of the separation principle when both random delays and packet 
losses are taken into account. 

We first review Nilsson’s result on the optimal state feedback controller with 
random bounded delays. We then generalize the result to account for packet losses 
from the controller to the actuator. Lastly we study the problem of output feedback 
control with random packet losses in the sensor measurements and we prove the 
optimality of the separation of estimation and control. 

Let $\tau_k$ denote the delay of the control command in the $k$th sample period. This
delay is defined to be the time from the beginning of the $k$th sample period to the
time that the control command is executed by the actuator. The sample period is $h$
and $0 < \tau_k < h$. We assume that $\tau_k$ is available to the controller and will justify this
assumption in Subsection 8.2.4. By discretizing the plant dynamics, we get

\[
\begin{aligned}
x_{k+1} &= \Phi x_k + \Gamma_0(\tau_k) u_k + \Gamma_1(\tau_k) u_{k-1} + w_k, \\
y_k &= C x_k + v_{s,k},
\end{aligned}
\tag{8.1}
\]

where $\Phi = e^{Ah}$, $\Gamma_0(\tau_k) = \int_0^{h-\tau_k} e^{As}\,ds\,B$, and $\Gamma_1(\tau_k) = \int_{h-\tau_k}^{h} e^{As}\,ds\,B$.
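The discretization in (8.1) can be checked numerically. The following is a minimal sketch for a scalar plant $\dot x = a x + b u$, which is an assumption made only to keep the integrals in closed form; the function name `discretize_with_delay` and the numerical values are illustrative and do not come from the chapter.

```python
import math

def discretize_with_delay(a, b, h, tau):
    """Return (phi, gamma0, gamma1) for the scalar plant dx/dt = a*x + b*u.

    The old command u_{k-1} acts on [kh, kh + tau) and the new command u_k
    acts on [kh + tau, kh + h), which is the setting behind (8.1).
    """
    if not 0.0 <= tau <= h:
        raise ValueError("delay must lie within one sample period")

    def integral_exp(lo, hi):
        # Closed form of the integral of exp(a*s) over [lo, hi].
        if a == 0.0:
            return hi - lo
        return (math.exp(a * hi) - math.exp(a * lo)) / a

    phi = math.exp(a * h)
    gamma0 = integral_exp(0.0, h - tau) * b   # weight on the new command u_k
    gamma1 = integral_exp(h - tau, h) * b     # weight on the old command u_{k-1}
    return phi, gamma0, gamma1

phi, gamma0, gamma1 = discretize_with_delay(a=1.0, b=1.0, h=0.1, tau=0.03)
print(phi, gamma0, gamma1)
```

Whatever the delay, `gamma0 + gamma1` equals the input gain of the delay-free discretization, since the two integrals together cover the whole sample period.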


At time $N$ we want to minimize a linear quadratic cost function

\[
J_N = E\left[ \sum_{k=0}^{N} u_k' Q_u u_k + \sum_{k=0}^{N} x_k' Q_x x_k + x_{N+1}' Q_N x_{N+1} \right].
\]


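With state feedback, we have $y_k = x_k$ and we first assume no packet losses
anywhere in the feedback loop. The optimal state feedback controller is found in
Theorem 5.1 in [20]. We restate it here in our setting. The control law that minimizes
our cost function is given by

\[
u_k(\tau_k) = -L_k(\tau_k) \begin{bmatrix} x_k \\ u_{k-1} \end{bmatrix}, \tag{8.2}
\]

where

\[
\begin{aligned}
L_k(\tau_k) &= \left(Q_u + \tilde S^{22}_{k+1}(\tau_k)\right)^{-1} \begin{bmatrix} \tilde S^{21}_{k+1}(\tau_k) & \tilde S^{23}_{k+1}(\tau_k) \end{bmatrix}, \\
\tilde S_{k+1}(\tau_k) &= G'(\tau_k)\, S_{k+1}\, G(\tau_k), \\
G(\tau_k) &= \begin{bmatrix} \Phi & \Gamma_0(\tau_k) & \Gamma_1(\tau_k) \\ 0 & I & 0 \end{bmatrix}, \\
S_k &= E_{\tau_k}\left\{ F_1'(\tau_k) \begin{bmatrix} Q_x & 0 \\ 0 & Q_u \end{bmatrix} F_1(\tau_k) + F_2'(\tau_k)\, \tilde S_{k+1}(\tau_k)\, F_2(\tau_k) \right\}.
\end{aligned}
\]

Here $\tilde S^{ij}_{k+1}(\tau_k)$ is the block $(i, j)$ of the symmetric matrix $\tilde S_{k+1}(\tau_k)$, and

\[
F_1(\tau_k) = \begin{bmatrix} I & 0 \\ \multicolumn{2}{c}{-L_k(\tau_k)} \end{bmatrix}, \qquad
F_2(\tau_k) = \begin{bmatrix} I & 0 \\ \multicolumn{2}{c}{-L_k(\tau_k)} \\ 0 & I \end{bmatrix}.
\]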

Now let us consider the state feedback problem with the control command sent
to the actuator over a wireless link. Thus control command packets can be lost. We
let $u_k = u_{k-1}$ when no control command is received in the $k$th sample period. We
want to find the optimal controller when the control command is received with delay
$\tau_k$, where $0 < \tau_k < h$.

The optimal controller can be found by extending the state space of $\tau_k$ in the
previous theorem. When a control command is lost, we let $\tau_k = \infty$. Correspondingly, we have

\[
F_1(\infty) = \begin{bmatrix} I & 0 \\ 0 & I \end{bmatrix} \quad \text{and} \quad
F_2(\infty) = \begin{bmatrix} I & 0 \\ 0 & I \\ 0 & I \end{bmatrix}.
\]

Since control packets are declared lost at the end of the sample period, we can think of a loss as a control update of
$u_k = u_{k-1}$ with a delay $h$. Thus $\tilde S_{k+1}(\infty) = \tilde S_{k+1}(h)$. This leads to the following
theorem.


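Theorem 1. When the control command is lost with probability $p_l$, the optimal state
feedback controller is given by:

\[
u_k(\tau_k) = -L_k(\tau_k) \begin{bmatrix} x_k \\ u_{k-1} \end{bmatrix} \tag{8.3}
\]

for $0 < \tau_k < h$, where $h$ is the sample period and

\[
\begin{aligned}
L_k(\tau_k) &= \left(Q_u + \tilde S^{22}_{k+1}(\tau_k)\right)^{-1} \begin{bmatrix} \tilde S^{21}_{k+1}(\tau_k) & \tilde S^{23}_{k+1}(\tau_k) \end{bmatrix}, \\
\tilde S_{k+1}(\tau_k) &= G'(\tau_k)\, S_{k+1}\, G(\tau_k), \\
G(\tau_k) &= \begin{bmatrix} \Phi & \Gamma_0(\tau_k) & \Gamma_1(\tau_k) \\ 0 & I & 0 \end{bmatrix}, \\
S_k &= (1 - p_l)\, E_{\tau_k}\left\{ F_1'(\tau_k) \begin{bmatrix} Q_x & 0 \\ 0 & Q_u \end{bmatrix} F_1(\tau_k) + F_2'(\tau_k)\, \tilde S_{k+1}(\tau_k)\, F_2(\tau_k) \right\} \\
&\quad + p_l \left\{ F_1'(\infty) \begin{bmatrix} Q_x & 0 \\ 0 & Q_u \end{bmatrix} F_1(\infty) + F_2'(\infty)\, \tilde S_{k+1}(\infty)\, F_2(\infty) \right\}, \\
F_1(\tau_k) &= \begin{bmatrix} I & 0 \\ \multicolumn{2}{c}{-L_k(\tau_k)} \end{bmatrix}, \qquad
F_1(\infty) = \begin{bmatrix} I & 0 \\ 0 & I \end{bmatrix}, \\
F_2(\tau_k) &= \begin{bmatrix} I & 0 \\ \multicolumn{2}{c}{-L_k(\tau_k)} \\ 0 & I \end{bmatrix}, \qquad
F_2(\infty) = \begin{bmatrix} I & 0 \\ 0 & I \\ 0 & I \end{bmatrix}, \\
S_{N+1} &= \begin{bmatrix} Q_N & 0 \\ 0 & 0 \end{bmatrix}.
\end{aligned}
\]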


Note that the iteration may not converge when the probability of loss, $p_l$, is too
large. This is because the controller cannot update often enough due to packet losses.
An upper bound on $p_l$ that guarantees stability can be found via standard MJLS
stability criteria [21].
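To make the recursion of Theorem 1 concrete, here is a minimal sketch for a scalar plant with a scalar input, so that $S_k$ is $2 \times 2$ and $\tilde S_{k+1}$ is $3 \times 3$. It assumes a finite set of possible delays, each given as a hypothetical triple `(probability, gamma0, gamma1)` conditioned on delivery, and it uses plain nested lists instead of a matrix library. The helper names (`matmul`, `quad`, `backward_step`, `solve`) and all numbers are illustrative, not the authors' implementation.

```python
def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def quad(f, s):
    """Return f' s f for matrices stored as nested lists."""
    ft = [list(col) for col in zip(*f)]
    return matmul(ft, matmul(s, f))

def backward_step(S, phi, delays, p_loss, gamma_h, qx, qu):
    """One backward step; `delays` holds (prob, gamma0, gamma1) triples whose
    probabilities sum to one (delay distribution given delivery)."""
    Q = [[qx, 0.0], [0.0, qu]]
    S_new = [[0.0, 0.0], [0.0, 0.0]]
    gains = []

    def accumulate(weight, F1, F2, S_tilde):
        term1, term2 = quad(F1, Q), quad(F2, S_tilde)
        for i in range(2):
            for j in range(2):
                S_new[i][j] += weight * (term1[i][j] + term2[i][j])

    for prob, g0, g1 in delays:
        G = [[phi, g0, g1], [0.0, 1.0, 0.0]]
        St = quad(G, S)                      # 3x3 matrix S~_{k+1}(tau)
        denom = qu + St[1][1]
        L = [St[1][0] / denom, St[1][2] / denom]
        gains.append(L)
        F1 = [[1.0, 0.0], [-L[0], -L[1]]]
        F2 = [[1.0, 0.0], [-L[0], -L[1]], [0.0, 1.0]]
        accumulate((1.0 - p_loss) * prob, F1, F2, St)

    # Lost command: the actuator holds u_{k-1}, modeled as a delay of h.
    G_h = [[phi, 0.0, gamma_h], [0.0, 1.0, 0.0]]
    F1_inf = [[1.0, 0.0], [0.0, 1.0]]
    F2_inf = [[1.0, 0.0], [0.0, 1.0], [0.0, 1.0]]
    accumulate(p_loss, F1_inf, F2_inf, quad(G_h, S))
    return S_new, gains

def solve(phi, delays, p_loss, gamma_h, qx=1.0, qu=1.0, qn=1.0, steps=500):
    """Iterate backward from S_{N+1} = diag(qn, 0) and return the last S and gains."""
    S = [[qn, 0.0], [0.0, 0.0]]
    gains = []
    for _ in range(steps):
        S, gains = backward_step(S, phi, delays, p_loss, gamma_h, qx, qu)
    return S, gains

# Illustrative run: negligible delay, 20% command loss.
S_loss, gains_loss = solve(phi=1.1, delays=[(1.0, 1.0, 0.0)], p_loss=0.2, gamma_h=1.0)
print(S_loss, gains_loss)
```

With `p_loss = 0` and a negligible delay, the first gain component reduces to the familiar scalar LQR gain, which gives a quick sanity check of the sketch.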

We now consider the general output feedback case where both sensor measurements and control commands need to be sent over a shared wireless network. Thus
both the sensor measurements and the control commands can be lost randomly. Let
$I^k$ be the information set that is available to the controller at the time of computing
the control command in the $k$th sample period. Note that $I^k$ is a random set that
depends on the packet dropping patterns in the current and previous sample periods. The maximum possible set of $I^k$ is $\{y_0, y_1, \ldots, y_k, u_1, \ldots, u_{k-1}\}$. Based on
our assumption, the controllers always know the past controls $\{u_1, \ldots, u_{k-1}\}$ and
the corresponding execution times $\{\tau_1, \ldots, \tau_{k-1}\}$, but the controllers may only know
a subset of the observations $\{y_0, y_1, \ldots, y_k\}$. In the next theorem, we
prove that the separation principle is optimal for the output feedback problem with random packet losses. Hence the separation principle still holds even in the presence of
packet losses.


Theorem 2. For the output feedback control problem, the optimal controller that
generates a control command to be updated at time $kh + \tau_k$ with information set $I^k$
is

\[
u_k(\tau_k, I^k) = -L_k(\tau_k) \begin{bmatrix} E[x_k | I^k] \\ u_{k-1} \end{bmatrix}, \tag{8.4}
\]

where $L_k$ is calculated as in Theorem 1 and $E[x_k | I^k]$ is the state estimate given the
information set $I^k$.


Proof. We prove the separation by writing the iteration of the quadratic cost function
as

\[
J_{k+1} = J_k + E\left[ w_k' S^{11}_{k+1} w_k \right]
+ E\left[ \left( u_k(\tau_k, I^k) - \bar u_k(\tau_k) \right)' \left( Q_u + \tilde S^{22}_{k+1}(\tau_k) \right) \left( u_k(\tau_k, I^k) - \bar u_k(\tau_k) \right) \,\middle|\, I^k \right],
\]

where

\[
\bar u_k(\tau_k) = -L_k(\tau_k) \begin{bmatrix} x_k \\ u_{k-1} \end{bmatrix}
\]

is the optimal control with perfect state information $x_k$.

To minimize $J_{k+1}$, the optimal control $u_k(\tau_k, I^k)$ is the minimum mean square
estimate of $\bar u_k(\tau_k)$. Recall that $I^k$ is the information set available for calculating
$u_k(\tau_k, I^k)$. Since $\bar u_k(\tau_k)$ is a linear function of $x_k$ and $u_{k-1}$, and $u_{k-1}$ is known,
the minimum mean square estimate of $\bar u_k(\tau_k)$ is

\[
u_k(\tau_k, I^k) = E[\bar u_k(\tau_k) | I^k] = -L_k(\tau_k) \begin{bmatrix} E[x_k | I^k] \\ u_{k-1} \end{bmatrix},
\]

where $E[x_k | I^k]$ is the minimum mean square estimate of the state variable given the
information set $I^k$. Note that this holds regardless of the information set $I^k$, which
is probabilistic due to the random packet losses of the sensor measurements.














This proof extends the separation principle to the case of partial observation 
losses and control command losses. The optimal controller can be separated into 
two cascaded parts: the state estimator and the state feedback controller. The state 
feedback controller can be computed using Theorem 1. The optimal state estimator 
needs further investigation. In the next subsection, we show how the Kalman filter 
can be modified to adapt to sensor measurement losses. 


8.2.3 Kalman filtering in the presence of partial observation losses 


In this subsection, we show how to use a modified Kalman filter to calculate
$E[x_k | I^k]$ when partial observation losses are possible. For illustration purposes, we
assume that the observation vector $y_k$ is divided into two parts $[y_{1,k}; y_{2,k}]$ and each
part is encoded and sent separately. All the results can be similarly extended to the
case where the observation is composed of more than two parts. Since the past controls and their execution times are known at the state estimator, it is sufficient to
consider the system with the following dynamics:

\[
\begin{aligned}
x_{k+1} &= \Phi x_k + w_k, \\
\begin{bmatrix} y_{1,k} \\ y_{2,k} \end{bmatrix} &= \begin{bmatrix} C_1 \\ C_2 \end{bmatrix} x_k + \begin{bmatrix} v_{1,k} \\ v_{2,k} \end{bmatrix},
\end{aligned}
\tag{8.5}
\]

where $x_k \in R^n$, $y_{1,k}, v_{1,k} \in R^{m_1}$, and $y_{2,k}, v_{2,k} \in R^{m_2}$. The system matrices are
of the appropriate dimensions. The covariance matrices of $v_{1,k}$ and $v_{2,k}$ are $R_{11}$ and
$R_{22}$ respectively. Comparing with the system dynamics in Equation (8.1), we have
$y_k = [y_{1,k}; y_{2,k}]$ and $C = [C_1; C_2]$. Note that the covariance matrix of $v$ is
$R = \begin{bmatrix} R_{11} & R_{12} \\ R_{21} & R_{22} \end{bmatrix}$
and the covariance matrix of $w$ is denoted as $Q$. We assume the system $(\Phi, C)$ is
observable, thus the Kalman filter converges without sensor measurement losses.
The measurement outputs $y_{1,k}$, $y_{2,k}$ are encoded separately and sent over different wireless channels in time step $k$. We use $\gamma_{i,k}$ to indicate whether $y_{i,k}$ is received
correctly in time step $k$. We assume $\gamma_{1,k}$ and $\gamma_{2,k}$ are i.i.d. Bernoulli random variables for all $k$ with $\Pr(\gamma_{1,k} = 1) = \lambda_1$ and $\Pr(\gamma_{2,k} = 1) = \lambda_2$. Note that $\lambda_1$ and
$\lambda_2$ represent the percentage of the sensor measurement packets that are correctly received. Also note that $\lambda_1$ and $\lambda_2$ are proportional to the link throughput from sensor
1 to the Kalman filter and the link throughput from sensor 2 to the Kalman filter,
respectively. We refer to the pair $(\lambda_1, \lambda_2)$ as the network throughput, which depends
on the channel gains, the network traffic and the network resource allocation (such
as power, time slots, etc.).

We also assume that $\gamma_{1,k}$ and $\gamma_{2,l}$ are independent for every $k$ and $l$. Thus, $y_{1,k}$
and $y_{2,k}$ can be independently lost or received. When an observation is lost, it is
equivalent to receiving the measurement with an infinite noise variance. The measurement noise $v_{i,k}$ is assumed to have the following conditional probability density:

\[
p(v_{i,k} \mid \gamma_{i,k}) =
\begin{cases}
N(0, R_{ii}) & \text{if } \gamma_{i,k} = 1, \\
N(0, \sigma_i^2 I) & \text{if } \gamma_{i,k} = 0,
\end{cases}
\]

where we take $\sigma_i \to \infty$ when the observation $y_{i,k}$ is lost. We assume that the cross-correlation terms $R_{12}$ and $R_{21}$ do not change as a function of $\gamma_{1,k}$ and $\gamma_{2,k}$. In fact,
the measurement noise at different sensors is often uncorrelated.


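Let $\gamma_k = [\gamma_{1,k}; \gamma_{2,k}]$, $\gamma_0^k = \{\gamma_0, \ldots, \gamma_k\}$, and $y_0^k = \{y_0, \ldots, y_k\}$. We define

\[
\begin{aligned}
\hat x_{k|k} &= E[x_k \mid y_0^k, \gamma_0^k], \\
P_{k|k} &= E[(x_k - \hat x_{k|k})(x_k - \hat x_{k|k})' \mid y_0^k, \gamma_0^k], \\
\hat x_{k+1|k} &= E[x_{k+1} \mid y_0^k, \gamma_0^k], \\
P_{k+1|k} &= E[(x_{k+1} - \hat x_{k+1|k})(x_{k+1} - \hat x_{k+1|k})' \mid y_0^k, \gamma_0^k].
\end{aligned}
\tag{8.6}
\]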





The time update of the Kalman filter is independent of the observation process
and thus stays the same as in the Kalman filter with no packet losses,

\[
\begin{aligned}
\hat x_{k+1|k} &= \Phi \hat x_{k|k}, \\
P_{k+1|k} &= \Phi P_{k|k} \Phi' + Q.
\end{aligned}
\tag{8.7}
\]





But the measurement update is now stochastic since the received measurements now
depend on the random variables $\gamma_{1,k}$ and $\gamma_{2,k}$.

When $\gamma_{1,k} = 1$, $\gamma_{2,k} = 1$, the complete observation measurements are received.
Thus, the measurement update is the same as in the Kalman filter with no packet
losses.

\[
\begin{aligned}
\hat x_{k+1|k+1} &= \hat x_{k+1|k} + P_{k+1|k} C' [C P_{k+1|k} C' + R]^{-1} (y_{k+1} - C \hat x_{k+1|k}), \\
P_{k+1|k+1} &= P_{k+1|k} - P_{k+1|k} C' [C P_{k+1|k} C' + R]^{-1} C P_{k+1|k}.
\end{aligned}
\tag{8.8}
\]




















When $\gamma_{1,k} = 0$, $\gamma_{2,k} = 0$, the optimal measurement update is to run one step
open loop. This also corresponds to the case of no observation in [22]:

\[
\begin{aligned}
\hat x_{k+1|k+1} &= \hat x_{k+1|k}, \\
P_{k+1|k+1} &= P_{k+1|k}.
\end{aligned}
\tag{8.9}
\]











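When $\gamma_{1,k} = 1$, $\gamma_{2,k} = 0$, only $y_{1,k}$ is received by the Kalman filter. The corresponding measurement noise covariance matrix is now

\[
\tilde R = \begin{bmatrix} R_{11} & R_{12} \\ R_{21} & R_{22} \end{bmatrix} + \begin{bmatrix} 0 & 0 \\ 0 & \sigma_2^2 I \end{bmatrix}. \tag{8.10}
\]

With the observation $y_{1,k}$ only, the Kalman filter updates assuming the noise covariance is $\tilde R$:

\[
\begin{aligned}
\hat x_{k+1|k+1} &= \hat x_{k+1|k} + P_{k+1|k} C' [C P_{k+1|k} C' + \tilde R]^{-1} (y_{k+1} - C \hat x_{k+1|k}), \\
P_{k+1|k+1} &= P_{k+1|k} - P_{k+1|k} C' [C P_{k+1|k} C' + \tilde R]^{-1} C P_{k+1|k}.
\end{aligned}
\tag{8.11}
\]

Note that

\[
\begin{aligned}
C' (C X C' + \tilde R)^{-1} C
&= C' \left( C X C' + R + \begin{bmatrix} 0 & 0 \\ 0 & \sigma_2^2 I \end{bmatrix} \right)^{-1} C \\
&\overset{(a)}{=} \lim_{\sigma_2 \to \infty} C' \left( C X C' + R + \begin{bmatrix} 0 & 0 \\ 0 & \sigma_2^2 I \end{bmatrix} \right)^{-1} C \\
&\overset{(b)}{=} C' \begin{bmatrix} M_{11} - M_{12} M_{22}^{-1} M_{21} & 0 \\ 0 & 0 \end{bmatrix} C \\
&\overset{(c)}{=} C' \begin{bmatrix} (C_1 X C_1' + R_{11})^{-1} & 0 \\ 0 & 0 \end{bmatrix} C \\
&\overset{(d)}{=} C_1' (C_1 X C_1' + R_{11})^{-1} C_1,
\end{aligned}
\]

where $[C X C' + R]^{-1} = \begin{bmatrix} M_{11} & M_{12} \\ M_{21} & M_{22} \end{bmatrix}$,
and (a) follows since $\sigma_2 \to \infty$, (b) is derived by using the low rank adjustment of
the matrix inversion formula [23] and taking $\sigma_2 \to \infty$, (c) is due to the alternative
formula of the inverse of a partitioned matrix [23], and (d) is derived by simple
multiplication of the partitioned matrices.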
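The limit used in steps (a) through (d) can be checked numerically. The sketch below assumes a scalar state and two scalar sensors, so that the $2 \times 2$ inverse can be written out by hand; the function names and numerical values are illustrative only.

```python
def info_term(x, c1, c2, r11, r12, r22, sigma2):
    """C'(C X C' + R + diag(0, sigma2))^{-1} C for a scalar state, C = [c1; c2]."""
    s11 = c1 * c1 * x + r11
    s12 = c1 * c2 * x + r12
    s22 = c2 * c2 * x + r22 + sigma2
    det = s11 * s22 - s12 * s12
    # 2x2 inverse written out explicitly.
    return (c1 * c1 * s22 - 2.0 * c1 * c2 * s12 + c2 * c2 * s11) / det

def info_term_sensor1(x, c1, r11):
    """C_1'(C_1 X C_1' + R_11)^{-1} C_1, the expression after step (d)."""
    return c1 * c1 / (c1 * c1 * x + r11)

x, c1, c2 = 2.0, 1.0, 0.5
r11, r12, r22 = 0.1, 0.02, 0.2
gaps = []
for sigma2 in (1e2, 1e4, 1e8):
    gap = abs(info_term(x, c1, c2, r11, r12, r22, sigma2)
              - info_term_sensor1(x, c1, r11))
    gaps.append(gap)
    print(f"sigma^2 = {sigma2:.0e}: gap = {gap:.3e}")
```

The gap shrinks as the variance assigned to the lost sensor grows, including when the cross-correlation `r12` is nonzero.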

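Therefore, for $\gamma_{1,k} = 1$ and $\gamma_{2,k} = 0$, the measurement update is

\[
\begin{aligned}
\hat x_{k+1|k+1} &= \hat x_{k+1|k} + P_{k+1|k} C_1' [C_1 P_{k+1|k} C_1' + R_{11}]^{-1} (y_{1,k+1} - C_1 \hat x_{k+1|k}), \\
P_{k+1|k+1} &= P_{k+1|k} - P_{k+1|k} C_1' [C_1 P_{k+1|k} C_1' + R_{11}]^{-1} C_1 P_{k+1|k}.
\end{aligned}
\tag{8.12}
\]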
Note this is equivalent to the classical Kalman filter measurement update if $y_1$ were
the only observation. This is somewhat surprising because it seems that the Kalman
filter does not distinguish between a packet loss and a non-existing sensor measurement. On the other hand, the result can be expected because the Kalman filter is now
stochastic and it only depends on the current packet arrivals.
Similarly, when $\gamma_{1,k} = 0$, $\gamma_{2,k} = 1$, the Kalman filter updates as if $y_2$ were the
only observation:

\[
\begin{aligned}
\hat x_{k+1|k+1} &= \hat x_{k+1|k} + P_{k+1|k} C_2' [C_2 P_{k+1|k} C_2' + R_{22}]^{-1} (y_{2,k+1} - C_2 \hat x_{k+1|k}), \\
P_{k+1|k+1} &= P_{k+1|k} - P_{k+1|k} C_2' [C_2 P_{k+1|k} C_2' + R_{22}]^{-1} C_2 P_{k+1|k}.
\end{aligned}
\tag{8.13}
\]
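Let $P_k = P_{k|k-1}$. Combining (8.7), (8.8), (8.9), (8.12) and (8.13), we get:

\[
\begin{aligned}
P_{k+1} = \Phi P_k \Phi' + Q
&- \gamma_{1,k} \gamma_{2,k}\, \Phi P_k C' (C P_k C' + R)^{-1} C P_k \Phi' \\
&- \gamma_{1,k} (1 - \gamma_{2,k})\, \Phi P_k C_1' (C_1 P_k C_1' + R_{11})^{-1} C_1 P_k \Phi' \\
&- (1 - \gamma_{1,k}) \gamma_{2,k}\, \Phi P_k C_2' (C_2 P_k C_2' + R_{22})^{-1} C_2 P_k \Phi'.
\end{aligned}
\tag{8.14}
\]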
As we have shown, the Kalman filter updates become stochastic when part or all
of the observation measurements can be lost randomly. Due to the stochastic nature,
we no longer have a unique deterministic error covariance matrix in the steady state.
We define

\[
\begin{aligned}
g_{\lambda_1 \lambda_2}(X) = \Phi X \Phi' + Q
&- \lambda_1 \lambda_2\, \Phi X C' (C X C' + R)^{-1} C X \Phi' \\
&- \lambda_1 (1 - \lambda_2)\, \Phi X C_1' (C_1 X C_1' + R_{11})^{-1} C_1 X \Phi' \\
&- (1 - \lambda_1) \lambda_2\, \Phi X C_2' (C_2 X C_2' + R_{22})^{-1} C_2 X \Phi'
\end{aligned}
\tag{8.15}
\]

as a useful shorthand since

\[
E[P_{k+1} \mid P_k] = g_{\lambda_1 \lambda_2}(P_k), \tag{8.16}
\]

and

\[
E[P_{k+1}] = E[g_{\lambda_1 \lambda_2}(P_k)]. \tag{8.17}
\]
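The relation between the random recursion (8.14) and its average (8.15), (8.16) can be illustrated in a small case. The sketch below assumes a scalar state, two scalar sensors and uncorrelated sensor noise ($R_{12} = 0$), which lets the covariance reduction be written through the matrix inversion lemma. The constants and function names are illustrative and not taken from the chapter.

```python
from itertools import product

PHI, Q = 1.2, 1.0        # scalar dynamics and process noise variance (illustrative)
C1, C2 = 1.0, 0.5        # scalar sensor gains
R11, R22 = 0.1, 0.2      # sensor noise variances; R12 = R21 = 0 in this sketch

def reduction(p, received):
    """P C'(C P C' + R)^{-1} C P restricted to the received sensors (p > 0)."""
    sensors = ((C1, R11, received[0]), (C2, R22, received[1]))
    info = sum(c * c / r for c, r, got in sensors if got)
    # With uncorrelated noise this equals P - (1/P + info)^{-1}.
    return p - 1.0 / (1.0 / p + info)

def next_cov(p, gamma1, gamma2):
    """Scalar form of (8.14) for one realization of the arrival indicators."""
    return PHI * PHI * (p - reduction(p, (gamma1, gamma2))) + Q

def g(p, lam1, lam2):
    """Scalar form of (8.15)."""
    return (PHI * PHI * p + Q
            - lam1 * lam2 * PHI * PHI * reduction(p, (1, 1))
            - lam1 * (1 - lam2) * PHI * PHI * reduction(p, (1, 0))
            - (1 - lam1) * lam2 * PHI * PHI * reduction(p, (0, 1)))

def expected_next_cov(p, lam1, lam2):
    """Average next_cov over the four arrival patterns, as in (8.16)."""
    total = 0.0
    for g1, g2 in product((0, 1), repeat=2):
        weight = (lam1 if g1 else 1 - lam1) * (lam2 if g2 else 1 - lam2)
        total += weight * next_cov(p, g1, g2)
    return total

print(expected_next_cov(2.0, 0.7, 0.4), g(2.0, 0.7, 0.4))
```

The two printed numbers agree up to rounding, which is the content of (8.16) in this scalar setting.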





We now state the main theorems of our results on Kalman filtering in the presence of partial observation losses. The detailed development and proofs can be found
in [3].

The convergence of the iteration $P_{k+1} = g_{\lambda_1 \lambda_2}(P_k)$ guarantees the boundedness
of $P_k$ for any $k$. The first theorem establishes the condition under which the iteration
$P_{k+1} = g_{\lambda_1 \lambda_2}(P_k)$ converges. It also proves the uniqueness of the solution when it
does converge.





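Theorem 3. Suppose $\exists$ a matrix $\tilde P > 0$ such that $\tilde P > g_{\lambda_1 \lambda_2}(\tilde P)$. Then:
(a) $\forall P_0 \ge 0$, the iteration $P_{k+1} = g_{\lambda_1 \lambda_2}(P_k)$ converges and

\[
\lim_{k \to \infty} P_k = \lim_{k \to \infty} g^k_{\lambda_1 \lambda_2}(P_0) = \bar P \tag{8.18}
\]

independent of the initial condition $P_0$; and
(b) $\bar P$ is the unique positive semidefinite fixed point of the iteration $P_{k+1} = g_{\lambda_1 \lambda_2}(P_k)$, that is, $\bar P = g_{\lambda_1 \lambda_2}(\bar P)$.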
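The convergence claimed in Theorem 3 can be observed in a small case. The following sketch iterates a scalar version of $g_{\lambda_1 \lambda_2}$ from several initial conditions, assuming a scalar state, two scalar sensors and uncorrelated sensor noise; all parameter values are illustrative.

```python
def g(p, lam1, lam2, phi=1.2, q=1.0, c1=1.0, c2=0.5, r11=0.1, r22=0.2):
    """Scalar form of (8.15) with uncorrelated sensor noise (R12 = 0)."""
    def posterior(info):
        # Covariance after a measurement update carrying `info` = sum of C_i^2 / R_ii.
        return 1.0 / (1.0 / p + info) if p > 0.0 else 0.0

    i1, i2 = c1 * c1 / r11, c2 * c2 / r22
    mix = (lam1 * lam2 * posterior(i1 + i2)
           + lam1 * (1 - lam2) * posterior(i1)
           + (1 - lam1) * lam2 * posterior(i2)
           + (1 - lam1) * (1 - lam2) * p)
    return phi * phi * mix + q

def iterate(p0, lam1, lam2, steps=200):
    """Apply g repeatedly, starting from the initial covariance p0."""
    p = p0
    for _ in range(steps):
        p = g(p, lam1, lam2)
    return p

limits = [iterate(p0, lam1=0.7, lam2=0.4) for p0 in (0.0, 1.0, 50.0)]
print(limits)
```

For these values the three runs settle on the same number, in line with part (a) of the theorem; this is a numerical illustration, not a proof.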


The steady state error covariance matrix $\bar P$ can be computed efficiently by solving a semidefinite program (SDP). It can be shown that $\bar P$ is an upper bound of the
expected error covariance. The next theorem shows the existence of a stability region boundary such that the expected error covariance matrix goes to infinity if the
throughput pair is less than the rates specified by the boundary.


Theorem 4. Assume $(\Phi, Q)$ is controllable and $(\Phi, C)$ is observable. Fix $0 < \lambda_1 < 1$.
If $P_{k+1} = g_{\lambda_1 \lambda_2}(P_k)$ is unstable for $\lambda_2 = 0$ while stable for $\lambda_2 = 1$, then $\exists\, \lambda_2^c$ with
$0 < \lambda_2^c < 1$ such that

\[
\lim_{k \to \infty} E[P_k] = \infty \quad \text{for } 0 < \lambda_2 < \lambda_2^c,
\]

and there exists a positive semidefinite matrix $M_{P_0} \ge 0$ as a function of the initial
condition $P_0 \ge 0$ such that

\[
E[P_k] \le M_{P_0} \quad \forall k \quad \text{for } \lambda_2^c < \lambda_2 < 1.
\]

If $P_{k+1} = g_{\lambda_1 \lambda_2}(P_k)$ is unstable for the given $\lambda_1$ when $\lambda_2 = 1$, then $\lambda_2^c = 1$. If
$P_{k+1} = g_{\lambda_1 \lambda_2}(P_k)$ is stable for the given $\lambda_1$ when $\lambda_2 = 0$, then $\lambda_2^c = 0$. We will get
the same stability region boundary if we fix $\lambda_2$ and vary $\lambda_1$.

Even though this boundary cannot always be found, the upper and lower bounds
of the stability region boundary can always be found by solving a feasibility problem
of a Linear Matrix Inequality (LMI).
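In one special case the boundary is available in closed form, which makes the statement easy to visualize. The sketch below assumes a scalar state with nonzero sensor gains and uncorrelated sensor noise: any received measurement bounds the posterior variance, so the iteration grows without bound exactly when $(1 - \lambda_1)(1 - \lambda_2)\Phi^2 \ge 1$. The closed form `critical_lambda2` holds only under these assumptions, and the names and numbers are illustrative.

```python
def g(p, lam1, lam2, phi, q=1.0, info1=10.0, info2=1.25):
    """Scalar (8.15); info_i stands for C_i^2 / R_ii, sensor noise uncorrelated."""
    def posterior(info):
        return 1.0 / (1.0 / p + info)

    mix = (lam1 * lam2 * posterior(info1 + info2)
           + lam1 * (1 - lam2) * posterior(info1)
           + (1 - lam1) * lam2 * posterior(info2)
           + (1 - lam1) * (1 - lam2) * p)
    return phi * phi * mix + q

def critical_lambda2(phi, lam1):
    """Threshold on lam2 for the scalar case described in the lead-in."""
    worst = phi * phi * (1.0 - lam1)
    return 0.0 if worst <= 1.0 else 1.0 - 1.0 / worst

def run(lam2, phi=1.5, lam1=0.5, p0=1.0, steps=400):
    """Iterate g from p0 and return the final value."""
    p = p0
    for _ in range(steps):
        p = g(p, lam1, lam2, phi)
    return p

lam2_c = critical_lambda2(phi=1.5, lam1=0.5)
print(lam2_c, run(lam2=0.05), run(lam2=0.2))
```

Below the threshold the iterate grows geometrically, and above it the iterate stays bounded, mirroring the two regimes in Theorem 4.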


8.2.4 Controller design 


The previous subsection shows that the optimal LQG controller with random 
packet losses has two cascaded parts: the Kalman filter and the state feedback con- 
troller. The Kalman filter calculates the minimum mean square error state estimate 
based on received sensor measurements. When all the sensor measurements are re- 
ceived, the classical steady state Kalman filter is used. When none of the sensor mea- 
surements are received, we can have the Kalman filter run a one step forward open 
loop, and this also gives the optimal state estimate. When only part of the sensor mea- 
surements are received, the Kalman filter updates as if the received measurements 
are the only measurements taken. Note that the optimal Kalman filter is a function of 
the previous error covariance matrix, which depends on the whole history of packet 
losses. This makes the computation of Kalman filter gains highly complex. Instead, we calculate
the steady state error covariance matrix $\bar P$ of the iteration $P_{k+1} = g_{\lambda_1 \lambda_2}(P_k)$
and use this covariance matrix to calculate the Kalman filter gains. Recall that $\bar P$
is an upper bound of the expected error covariance matrix. Thus the filter does not
output the optimal state estimate but the performance degradation is minimal. The 
Kalman filter gains now only depend on the sensor measurement losses in the current 
time slot and thus are much easier to compute. 

The state feedback controller is a function of the total time delay in the feedback 
loop. Thus, it is time varying. The total time delay is from the time when mea- 
surements are taken to the time when the actuator updates with the received control 
command. We assume that the control command is calculated based on the available 
sensor measurement information right before the transmission to the actuator. Re- 
call that we assume there is a reliable ACK for every transmission. Therefore, the 
controller knows the time delay of the control command if the next transmission is 
successful. If a control command is lost, its value does not affect the control system. 
Therefore it is reasonable to assume that the controller knows the delay $\tau_k$ at the time
of computing $u_k$.

Upon receiving a control command, the actuator needs to decide if it should up- 
date with the received control command. We assume that there is an indicator bit 
in the control command packet that tells the actuator whether the control command 
is computed based on full measurement information or partial information. When 
the actuator receives a control command that is computed based on full observation 
information, the actuator updates itself and does not update until the next sample 
period. In case of receiving a control command based on partial measurement infor- 
mation, the actuator will hold the command and only update with this command if no 
further control command is received by the end of the sample period. Such an update 
only occurs at the end of the sample period. When no control command is received, 
the actuator continues to use its previous command until a new one is received in the 
next sample period. Note that this is just our design and we are not optimizing this
part of the controller. We hope to study this problem in the future. 


8.2.5 Performance evaluation 


We evaluate the system performance by modeling the closed loop system as an
MJLS. Define the augmented system state vector $\bar x(k) = [x(k); \hat x(k|k-1); y(k-1); u(k-1)]$
and the joint noise vector $\bar w(k) = [w(k); v(k)]$, where $v(k) = v_s(k) + v_q(k)$,
$v_s(k)$ is the measurement noise and $v_q(k)$ is the quantization noise. Note
that $x$ and $w$ are the discretized state and disturbance and $\hat x(k|k-1)$ is the Kalman
filter state estimate. We choose the Markovian state $r = (D, s_c, s_m)$ where $D$ is the
time delay in the control command, $s_c$ indicates the control command loss and the
sensor measurement information available to the controller at the time of command
calculation, and $s_m$ indicates the sensor measurement loss at the estimator:

\[
s_c =
\begin{cases}
0 & \text{control command is based on full observation,} \\
1 & \text{control command is based on } y_1 \text{ only,} \\
2 & \text{control command is based on } y_2 \text{ only,} \\
3 & \text{control command is based on no new measurements,} \\
4 & \text{control command is lost,}
\end{cases}
\]

\[
s_m =
\begin{cases}
0 & \text{no sensor measurement losses,} \\
1 & \text{sensor measurements } y_1 \text{ received but } y_2 \text{ lost,} \\
2 & \text{sensor measurements } y_2 \text{ received but } y_1 \text{ lost,} \\
3 & \text{all sensor measurements are lost.}
\end{cases}
\]

Note that for all $D < T$, where $T$ is the sample period, we always have $s_c =
s_m = 0$, while when $D = T$, we can have $s_c = 0$ and $s_m = 0$, or $s_c = 1$ and
$s_m = 0, 1$, or $s_c = 2$ and $s_m = 0, 2$, or $s_c = 3$ and $s_m = 0, 1, 2, 3$, or $s_c = 4$ and
$s_m = 0, 1, 2, 3$.⁴ Therefore we have $L + 12$ Markovian states, where $L$ is the number
of time slots in one sample period. We can write the system in the form of an MJLS
as $\bar x(k+1) = F_r \bar x(k) + G_r \bar w(k)$ for $r = 1, 2, \ldots, L + 12$. The system matrices $F_r$,
$G_r$ can be easily derived. Let $\Sigma_x(k) = E[\bar x(k) \bar x(k)']$, then

\[
\Sigma_x(k+1) = \sum_{r=1}^{L+12} q_r F_r \Sigma_x(k) F_r' + \sum_{r=1}^{L+12} q_r G_r \Sigma_w G_r',
\]

where $q_r$ is the probability that the MJLS is in state $r$ and $\Sigma_w$ denotes the covariance of $\bar w(k)$. As $k \to \infty$, it can be
shown [20] that a unique steady-state covariance matrix $\Sigma_x^\infty = \lim_{k \to \infty} \Sigma_x(k)$ exists
when the recursion is stable. We can now evaluate the linear quadratic cost function
since $J_{LQG} = \mathrm{Trace}\left(\mathrm{diag}(Q, 0, 0, 0)\, \Sigma_x^\infty\right) + \mathrm{Trace}\left(\mathrm{diag}(0, 0, 0, R)\, \Sigma_x^\infty\right)$.
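The state count $L + 12$ can be reproduced by enumerating the admissible combinations listed above. The sketch below assumes the delay $D$ is counted in time slots from 1 to $L$, with $D = L$ standing for $D = T$; that indexing and the names `ALLOWED_SM` and `markov_states` are illustrative choices.

```python
ALLOWED_SM = {      # s_c -> values of s_m that can occur with it when D = T
    0: (0,),
    1: (0, 1),
    2: (0, 2),
    3: (0, 1, 2, 3),
    4: (0, 1, 2, 3),
}

def markov_states(num_slots):
    """List the states r = (D, s_c, s_m), with D counted in time slots."""
    states = [(d, 0, 0) for d in range(1, num_slots)]        # D < T
    states += [(num_slots, sc, sm)
               for sc, sms in ALLOWED_SM.items() for sm in sms]  # D = T
    return states

states = markov_states(num_slots=8)
delivered_at_T = [s for s in states if s[0] == 8 and s[1] != 4]
print(len(states), len(delivered_at_T))
```

With eight slots this yields 20 states, and nine of the states with $D = T$ correspond to a delivered command, matching the count in the footnote on $\Pr(D = T)$.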

We have found the optimal controller that adapts to the random delays and packet 
losses in the feedback loop. In the following sections, we will study the network 
design issues in order to minimize the performance degradation. In the next section, 
we first motivate why a cross-layer framework is necessary. 


4 The delay distribution that we use to calculate the state feedback controller is exactly the 
distribution of D when a control command is received. Note that Pr(D = T) sums up 
nine different probabilities. 




8.3 Cross-layer Network Design for Distributed Control 


A layered network architecture is central to most data network designs. Layer- 
ing provides design modularity that facilitates standardization and implementation. 
The international standard Open Systems Interconnection (OSI) reference model includes
seven layers from top to bottom: the Application layer, the Presentation layer, the
Session layer, the Transport layer, the Network layer, the Data Link layer (which contains the medium access control
(MAC) sublayer), and the Physical layer. Traditionally, each layer is designed separately
with control messages passing between adjacent layers. The idea of cross-layer de- 
sign is to jointly design these different layers. Cross-layer design can imply a joint 
design across all network layers simultaneously, which is highly complex. Alterna- 
tively, it can entail choosing parameters or protocols at different network layers from 
existing designs in a joint fashion, which is our design approach. The goal of cross- 
layer design is to provide the best possible end-to-end performance of the applica- 
tion. Application examples include voice, video, web browsing, and high speed data 
transfer. Cross-layer design has shown significant performance benefits for applica- 
tions with hard delay constraints, such as video [11]. In joint control and network 
design, the application is control. 

We consider a simpler four-layer architecture for cross-layer design to illustrate 
the benefits in control applications. These layers are shown in Fig. 8.4. The physical layer defines a point-to-point communication link. The MAC layer defines how
the channel is shared among multiple transmitters. The network layer implements 
routing and flow control for the network. The application layer supports distributed 
control. Therefore, we consider the control system design as parameters in the appli- 
cation layer of our four layer network model. 


Application Layer      Control Parameters: Sample Period, Performance Index, etc.
Network Layer          Routing Algorithms, Flow Control
MAC Layer              Medium Access Control Protocol
Link/Physical Layer    Modulation, Coding, etc.





Fig. 8.4. Layered structure of wireless network. 


The goal of the Network, MAC, and Link layers is to optimize control perfor- 
mance. This performance is a complicated function of the packet delay distribution, 
the probability of packet loss and the data resolution associated with the network. 
Note that the average delay, which is often used as a performance metric in other
wireless systems, is an inadequate metric for control applications since the closed loop
system performance depends on the full delay distribution, not just on the average
delay. The link design, the MAC protocol, and the routing algorithm jointly affect the
delay distribution and the packet loss probability. The sample period of the control 
system is considered as a parameter of the application layer. The sample period de- 
termines how often new packets are generated and when the old packets are dropped. 
Thus the sample period affects the network traffic, which in turn affects the delay dis- 
tribution and packet losses. Therefore, it is important to design the parameters of the 
MAC layer, link layer, and controller jointly. However, it is difficult to quantify the 
impact of a network design on the control performance analytically for all control 
systems. This is because the control performance is an implicit and intricate function 
of the network parameters. We use a numerical example with classical inverted pen- 
dulum systems to illustrate our framework for cross-layer design and its associated 
performance gains. 

Cross-layer network design over all layers is a very challenging problem. In particular, it is difficult to simultaneously optimize all the layers, which motivated the
OSI model in the first place. Thus, we study a suboptimal iterative method for cross-layer optimization over a subset of the network layers: the link layer, the MAC layer
and the application layer (sample period selection). To jointly choose
the link design, the MAC protocol and the sample period, we first fix
a sample period and a MAC protocol and choose the best link layer design. For this
link design and the sample period, we choose the best MAC protocol. The third step
is to optimize the sample period for the chosen link and MAC protocol design. We
then iterate the algorithm until it converges. It is important to point out that we optimize the controller design for each network design choice. Even though this network
optimization is based on just a few protocol parameters, it can yield significant
performance gains and insight as we show in Section 8.5. In the next section, we
describe our wireless network model.
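The three-step iteration described above has the shape of a coordinate-wise search. The sketch below is one way to write it, assuming finite candidate sets and a user-supplied `cost` function that stands in for the closed-loop cost of the best controller for a given design. The function `toy_cost` and its numbers are made up solely to exercise the loop and are not results from the chapter.

```python
def cross_layer_search(cost, links, macs, periods, max_rounds=20):
    """Cycle through link, MAC and sample-period choices until none changes."""
    link, mac, period = links[0], macs[0], periods[0]
    for _ in range(max_rounds):
        previous = (link, mac, period)
        link = min(links, key=lambda l: cost(l, mac, period))       # step 1
        mac = min(macs, key=lambda m: cost(link, m, period))        # step 2
        period = min(periods, key=lambda h: cost(link, mac, h))     # step 3
        if (link, mac, period) == previous:
            break
    return link, mac, period

def toy_cost(link, mac, period):
    # Made-up numbers, only to exercise the loop; not taken from the chapter.
    link_penalty = {"BPSK (15,11)": 1.0, "QPSK (15,7)": 0.6, "QPSK uncoded": 1.4}[link]
    mac_penalty = {"TDMA": 0.5, "RA": 0.8}[mac]
    return link_penalty + mac_penalty + (period - 0.02) ** 2 * 1e3

best = cross_layer_search(
    toy_cost,
    links=["BPSK (15,11)", "QPSK (15,7)", "QPSK uncoded"],
    macs=["RA", "TDMA"],
    periods=[0.01, 0.02, 0.04],
)
print(best)
```

Such a search stops at a point where no single choice can be improved on its own, which need not be the joint optimum; this matches the description of the method as suboptimal.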


8.4 The Wireless Network Model 


[Figure: block diagram of the link. Transmitter: source encoder (uniform quantizer), channel encoder, transmit buffer, modulator (MOD). Wireless channel: gain $\sqrt{g_i(k)}$ and additive noise $n_i(k)$. Receiver: demodulator (DEMOD) and decoder.]


Fig. 8.5. Wireless communication link model. 
8.4.1 Wireless channel model


We consider a discrete time channel with stationary, ergodic, slowly time-varying
gain $\sqrt{g_i(k)}$ and additive white Gaussian noise (AWGN) $n_i(k)$, where the subscript $i$
refers to the $i$th link and $k$ refers to the $k$th time instant. We assume the channel gains
are static. This is justified by the assumption of very slow fading where the channel
coherence time (the time over which the channel remains roughly constant) is long
enough so that the control system converges to steady-state within a coherence time
interval. We assume that the channel power gain $g_i(k)$ is independent of the channel
input and the transmission power $P_i$ does not change as the channel gain varies.


8.4.2 Wireless link model 


Different link layer design choices (coding, modulation, etc.) lead to different 
performance in terms of data rate and probability of error [24]. We assume a simple 
class of communication link designs as shown in Fig. 8.5. The figure shows the 
wireless link from a sensor to a controller. We assume the same link model for all 
the wireless links including the links from controllers to the actuators. 

Each transmitter is assigned a unique ID number and this ID number is attached
to the data (sensor measurement or control command) that needs to be sent. Assuming we have $M$ transmitters, each ID number consists of $\lceil \log_2 M \rceil$ bits. At
the transmitter, the data is first quantized and converted into a binary bit stream via
a uniform quantizer. The bit stream, with the sender ID number piggy-backed on it, goes
through the channel encoder that uses BCH codes for error correction and then a
CRC (Cyclic Redundancy Check) for error detection. The effect of undetected errors can be disastrous in control applications since the actuator will use an erroneous
control command as the correct one. We use a 16-bit CRC for which the probability
of undetected errors is roughly $2^{-16}$, which is less than 0.01%. We thus ignore the
effects of undetected errors since the probability is negligible. We use either BPSK
or QPSK modulation at the transmitter. At the receiver, we assume coherent detection of the PSK signals. The BCH decoder can correct some transmission errors
depending on the number of error correcting bits. After error correction, the receiver
performs the CRC checksum for error detection. When no error is detected, the receiver sends an ACK back to the transmitter. If the transmitter receives the ACK, it
clears its transmit buffer and does not transmit until a new packet arrives. Note that
retransmissions are allowed if there are extra time slots. We assume that the transmit
buffer only has a capacity of one data packet. Thus a packet will be discarded⁵ if it
has not been successfully received by the end of the sample period. Therefore, if a
packet is successfully received, the packet delay is bounded by one sample period.
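The two small computations in this paragraph, the ID field width and the CRC miss probability, can be written out as follows. The function name `id_field_bits` is illustrative, and the integer `bit_length` trick is used here as an exact way to evaluate $\lceil \log_2 M \rceil$ for $M \ge 1$.

```python
def id_field_bits(num_transmitters):
    """Bits needed to give each of M transmitters a unique ID: ceil(log2 M)."""
    if num_transmitters < 1:
        raise ValueError("need at least one transmitter")
    # (M - 1).bit_length() equals ceil(log2 M) for every integer M >= 1.
    return (num_transmitters - 1).bit_length()

CRC_BITS = 16
undetected_error_prob = 2.0 ** -CRC_BITS   # rough miss probability of a 16-bit CRC

print(id_field_bits(6), f"{undetected_error_prob:.6%}")
```

For six transmitters the ID field is three bits wide, and the CRC miss probability is well under the 0.01% figure quoted above.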

From the control perspective, the relevant communication parameters are data 
rate, time delay and probability of packet loss. Thus we can simplify the link model 
as in Fig. 8.6. This simplified model is sufficient to calculate all the communication 


5 In a control system, a new measurement is always more valuable than old measurements. 
Each transmitter only needs to send the newest data available. 
parameters that may affect the control performance. The covariance of the quantization noise $v_{q,i}$ is a function of the number of data bits representing the signal, and
thus depends on the data rate. Both the time delay distribution and the probability of
packet loss are determined by the MAC protocols, total number of retransmissions,
and probability of successful transmission $p_s$. The probability of successful transmission $p_s$ for each packet can be easily calculated given the link design, wireless
channel gain and transmit power.


[Figure: the measurement $y_i(k)$ plus quantization noise $v_{q,i}(k)$ is received with probability $p_s$ and lost with probability $1 - p_s$.]




















Fig. 8.6. Simplified model of the communication link. 


8.4.3 MAC protocols 


A common transmission scheduling protocol is TDMA (Time Division Multiple Access). TDMA is a collision-free protocol; in fixed TDMA, time slots are assigned in
advance and never changed. We consider fixed TDMA and assume that time slots
are divided evenly among all the transmitter/receiver pairs. Since the time slots are
pre-assigned, a time slot can be wasted if the pre-assigned transmitter no longer has
a packet to send.

We also consider a contention-based protocol, Random Access (RA) with ACK.
The ACK is a small packet that is sent back to the transmitter upon a successful
transmission. We assume no spatial reuse, so any two simultaneous transmissions
will collide and cause packet losses.⁶ With RA, each transmitter attempts to grab
the channel independently with a probability of $p$ at any given time slot. With ACK,
the transmitter does not send redundant packets for information that has already been
successfully decoded.
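The effect of the access probability $p$ in RA can be seen from the chance that a slot carries exactly one transmission. The sketch below assumes that all transmitters have a packet waiting in every slot and that an uncollided transmission always succeeds; both are simplifications made for illustration, and the function name is hypothetical.

```python
def ra_slot_success(num_transmitters, p):
    """Probability that exactly one of n backlogged transmitters sends in a slot."""
    n = num_transmitters
    return n * p * (1.0 - p) ** (n - 1)

n = 6
grid = [i / 100.0 for i in range(1, 100)]
best_p = max(grid, key=lambda p: ra_slot_success(n, p))
print(best_p, ra_slot_success(n, best_p))
```

Under these assumptions the best access probability is close to $1/n$, and even then a sizeable share of the slots is lost to collisions or idle time, whereas fixed TDMA wastes a slot only when its owner has nothing to send.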


8.5 Numerical Example 


The cart with an inverted pendulum, shown in Fig. 8.7, is controlled with a force,
$F$, to cancel the random disturbance $w$ and maintain the pendulum in an upright
position. We use $x$ to denote the cart position coordinate and $\theta$ as the pendulum
angle from vertical.

For this example, we assume two identical inverted pendulum plants with the parameter choices as listed in Fig. 8.7. The state of the system is chosen as $[x(t), \dot x(t),$


6 In practice the receiver may be able to decode one of the messages even with interference 
from another. This capability is called the capture effect [25]. 
[Figure: cart of mass $M$ driven by the force $F$ and the disturbance $w$, carrying a pendulum of mass $m$ at angle $\theta$ from vertical.]

Mass of the cart: $M$ = 0.5 kg
Mass of the pendulum: $m$ = 0.2 kg
Friction of the cart: $b$ = 0.1 N/m/sec
Length to pendulum center of mass: $l$ = 0.3 m
Inertia of the pendulum: $J$ = 0.006 kg·m²








Fig. 8.7. Inverted pendulum and cart. 


O(t), Ò(t)]. The system dynamics are not linear in 0. We assume the pendulum 
does not move more than a few degrees away from the vertical and linearize the 
system dynamics about 6 = 0. We can thus get the standard linear model for 
the inverted pendulum. We would like to minimize the linear quadratic cost func- 
fon with Q='"(1,1) = 15,05" (3,3) = 25,0- = Land Q7 (1,1) = 
1, Q? (3,3) = 100, Q<?> = 1, where all the unspecified elements in Q<‘> are 
zero. These weight matrices Q<!>, Q<?> and Q<!*,Q<?> are chosen to reflect 
the different priorities of different systems and different signals. In this example, the 
second system weighs more on @ since its main goal is to keep 0 small while the first 
system gives roughly equal emphasis on x and $\theta$. The measurement noise is assumed
to be Gaussian with zero mean and covariance matrix $R = [10^{-4}, 0; 0, 10^{-6}]$.
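
The text refers to "the standard linear model" without writing it out. The sketch below builds one commonly used textbook linearization from the parameters listed in Fig. 8.7, with state $[x, \dot{x}, \theta, \dot{\theta}]$ and input F. The specific matrix entries, the sign conventions, and the value g = 9.8 m/s^2 are assumptions on our part, not taken from the chapter.

```python
# Sketch: linearized cart-pendulum model x_dot = A x + B F about theta = 0.
# Assumptions: textbook linearization, g = 9.8 m/s^2, state
# [x, x_dot, theta, theta_dot]; parameters are those listed in Fig. 8.7.

M = 0.5    # mass of the cart (kg)
m = 0.2    # mass of the pendulum (kg)
b = 0.1    # friction of the cart (N/m/sec)
l = 0.3    # length to pendulum center of mass (m)
J = 0.006  # inertia of the pendulum (kg m^2)
g = 9.8    # gravitational acceleration (m/s^2), assumed

den = J * (M + m) + M * m * l ** 2  # common denominator of the entries

A = [
    [0.0, 1.0, 0.0, 0.0],
    [0.0, -(J + m * l ** 2) * b / den, (m ** 2) * g * l ** 2 / den, 0.0],
    [0.0, 0.0, 0.0, 1.0],
    [0.0, -m * l * b / den, m * g * l * (M + m) / den, 0.0],
]
B = [0.0, (J + m * l ** 2) / den, 0.0, m * l / den]

for row in A:
    print(["%8.4f" % entry for entry in row])
print(["%8.4f" % entry for entry in B])
```

The positive entry coupling $\theta$ into $\ddot{\theta}$ reflects the open-loop instability of the upright position, which is why delay and packet loss in the feedback loop matter so much for this plant.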


8.5.1 Link layer resource allocation and design tradeoffs 


We first show how different link designs affect the control performance. We as- 
sume TDMA in which the parameters for each system are transmitted sequentially 
as illustrated in Fig. 8.3 earlier in the chapter. Since we have 6 transmitter/receiver 
pairs, the ID field for each packet is 3 bits. We also require a minimum of 4 bits 
to represent each measurement/control command and approximate the quantization 
noise as a Gaussian random variable. We use two modulation schemes: BPSK and 
QPSK. QPSK provides twice the data rate of BPSK but QPSK incurs a larger prob- 
ability of bit error for a fixed transmission power and bandwidth. We consider three 
different frame sizes: 24 bits, 32 bits and 48 bits. Note that there is a 16 bit CRC in 
each frame and we also use one bit guard time between each transmission. We use 
BCH codes for error correction. With a 32-bit frame, we can use (15, 11) and (15, 
7) codes, where the first number is the total number of coded bits in the codeword 
and the second is the number of information bits. The code rate is defined as the
ratio of the number of message bits to the total number of bits in the codeword. With
a 48-bit frame, we can use (31, 26), (31, 21), (31, 16) and (31, 11) codes. We also 
consider the cases where no error correction coding is used. We represent these cases 
by (7, 7), (15, 15) and (31, 31) for 24-bit, 32-bit and 48-bit frames, respectively. In 
this example, we assume 96 Ksymbols/sec, thus the data rate is 96 Kbps for BPSK
and 192 Kbps for QPSK. The transmission power is 10 mW and the noise density is
$N_0 = 10^{-8}$ W/Hz. We consider a static channel gain g = 0.2.
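
The frame sizes and codes above fit together arithmetically if each frame is read as one codeword plus the 16-bit CRC plus one guard bit. That reading is our assumption; the text does not state the frame layout explicitly. The sketch below tabulates frame length, code rate and frame transmission time for the nine codes and two modulations; all function and variable names are illustrative.

```python
# Sketch: bookkeeping for the link designs, assuming
# frame = codeword (n bits) + 16-bit CRC + 1 guard bit.

CRC_BITS = 16
GUARD_BITS = 1
SYMBOL_RATE = 96_000                       # symbols per second
BITS_PER_SYMBOL = {"BPSK": 1, "QPSK": 2}
CODES = [(7, 7), (15, 15), (15, 11), (15, 7),
         (31, 31), (31, 26), (31, 21), (31, 16), (31, 11)]

def frame_bits(n):
    """Frame length in bits under the assumed layout."""
    return n + CRC_BITS + GUARD_BITS

def code_rate(n, k):
    """Ratio of message bits to codeword bits."""
    return k / n

def frame_time_ms(n, modulation):
    """Time to send one frame, in milliseconds."""
    bit_rate = SYMBOL_RATE * BITS_PER_SYMBOL[modulation]
    return 1000.0 * frame_bits(n) / bit_rate

for modulation in BITS_PER_SYMBOL:
    for n, k in CODES:
        print(modulation, (n, k), frame_bits(n),
              round(code_rate(n, k), 3), round(frame_time_ms(n, modulation), 4))
```

Under this reading the nine codes and two modulations give the 18 link designs discussed with Fig. 8.8.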



X. Liu and A. Goldsmith 


[Fig. 8.8 plot title: Impact of Link Layer Designs]

Fig. 8.8. TDMA with different link layer designs.


In Fig. 8.8, we plot the generalized $H_2$ norm against different BCH codes for both
BPSK and QPSK. The first graph plots the performance of all 18 link designs.
Since this graph is hard to read, we illustrate the results with three sub-plots. The
second and third graphs plot the performance under fixed frame sizes of 48 bits and
32 bits, respectively. For a given frame size, QPSK allows twice as many time slots
as BPSK, and the transmission time of each frame is only half of the BPSK
transmission time due to the doubled data rate. However, QPSK incurs a higher
probability of bit error, which in turn leads to a higher probability of frame error.
The probability of frame error can be reduced if a strong error correction code is
used. As we see from the second plot, QPSK performs better when the code rate is
low since more transmission errors can be corrected. Both the second and the third
plots show that BPSK performs better than QPSK when the code rate is high. The
last plot in the figure compares the performance for different frame sizes when no
error correction is used. Smaller frame sizes lead to more time slots and a smaller
probability of frame error when no error correction is used. Therefore, lower delay
and fewer packet losses can be expected. However, data resolution also matters:
with small frame sizes, we have few bits to represent the signal, so the quantization
noise is large. This is why (15, 15) outperforms (7, 7) for both BPSK and QPSK. It is
surprising that the overall best performer is BPSK with 32-bit frames and no error
correction coding, even though several other link designs perform only slightly
worse. The reason is that this link design achieves the best tradeoff among the data
resolution, time delay, and probability of packet loss in our control performance
metric.
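
The interplay between modulation and coding strength can be made concrete with a back-of-the-envelope calculation. The sketch below is not the chapter's computation: it assumes coherent detection on an AWGN channel, treats g as a power gain and the stated noise density as $N_0$, uses bounded-distance decoding, and ignores errors in the CRC and guard bits. The error-correcting capabilities t are the standard values for these BCH codes.

```python
import math

# Sketch under assumptions: AWGN channel, coherent BPSK/QPSK, g is a power
# gain, noise density is N0, decoder corrects up to t errors per codeword.

TX_POWER = 10e-3       # W
CHANNEL_GAIN = 0.2     # assumed to be a power gain
NOISE_DENSITY = 1e-8   # W/Hz, assumed to be N0
SYMBOL_RATE = 96e3     # symbols per second

# (n, k) -> number of correctable errors t (t = 0 means uncoded)
BCH_T = {(7, 7): 0, (15, 15): 0, (15, 11): 1, (15, 7): 2, (31, 31): 0,
         (31, 26): 1, (31, 21): 2, (31, 16): 3, (31, 11): 5}

def q_function(x):
    """Gaussian tail probability Q(x)."""
    return 0.5 * math.erfc(x / math.sqrt(2.0))

def bit_error_prob(modulation):
    """Bit error probability at fixed power and symbol rate."""
    es_over_n0 = TX_POWER * CHANNEL_GAIN / (NOISE_DENSITY * SYMBOL_RATE)
    bits_per_symbol = 1 if modulation == "BPSK" else 2
    eb_over_n0 = es_over_n0 / bits_per_symbol
    return q_function(math.sqrt(2.0 * eb_over_n0))

def codeword_error_prob(n, t, pb):
    """Probability of more than t bit errors among n independent bits."""
    correctable = sum(math.comb(n, i) * pb ** i * (1.0 - pb) ** (n - i)
                      for i in range(t + 1))
    return 1.0 - correctable

for modulation in ("BPSK", "QPSK"):
    pb = bit_error_prob(modulation)
    for (n, k), t in BCH_T.items():
        print(modulation, (n, k), codeword_error_prob(n, t, pb))
```

Even this crude model reproduces the qualitative tradeoff in the text: QPSK halves the energy per bit, so it relies on low-rate codes to keep the frame error probability down, while short uncoded frames have fewer bits exposed to error.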


8.5.2 Cross layer design 










































































[Fig. 8.9 panels: "1st Iteration: Link" and "2nd Iteration" plot the control
performance against the BCH codes (7,7), (15,15), (15,11), (15,7), (31,31), (31,26),
(31,21), (31,16) and (31,11) for BPSK and QPSK; "1st Iteration: MAC" plots it against
the channel access probability p; "1st Iteration: Sample Period" plots it against the
sample period (msec).]


Fig. 8.9. Cross layer design: the link, MAC and application layer. 


With Fig. 8.9, we illustrate the procedure of an iterative cross-layer design for 
the link layer, the MAC protocol, and the sample period. Our link design parameters 
are the modulation scheme, the frame size, and the error correction coding. We use 
RA with ACK as the MAC protocol and we vary the channel access probability p 
in order to optimize the performance. We assume all senders have an equal channel 
access probability and use the same link design for simplicity. The application layer 
parameter is the sample period. We start with an initial sample period of 5 msec and 
the channel access probability p = E, The first plot in Fig. 8.9 shows the control 
performance as a function of different link designs. The optimal link design is QPSK
with the (15,11) BCH code in 32-bit frames. We keep this link design and optimize
the MAC protocol within the class of RA with ACK for the initial sample period
of 5 msec. The second plot shows the $H_2$ norm as a function of the channel access
probability in RA with ACK. The control performance first improves ($H_2$ norm
decreases) and then degrades as the channel access probability increases. When the
senders access the channel with small probabilities, the channel is mostly idle but
senders are not transmitting at the rates they need to clear their buffers. Thus the
packet delay is long. On the other hand, large access probabilities lead to collisions,
which also cause packet delay and losses. The optimal choice is p = 0.26. The third
step is to choose the optimal sample period now that we have updated our choice of
both the MAC protocol and the link design. The third plot compares the system
performance as the sample period varies. The optimal sample period is T = 6 msec.
Then we go back to the first step. The fourth plot shows the control performance
versus different link designs for T = 6 msec and p = 0.26. Again, QPSK with (15,11)
is the best link design. The algorithm converges in the next step when p = 0.26 is
again optimal within the class of RA with ACK protocols. Even though the $H_2$ norm
of the control system would have been as large as 11.3 if we had chosen QPSK with
48-bit frames and no error correction coding, the performance gain in this case is not
significant. Any sensible choice of the link design gives a reasonable performance.
This is quite different from what we have studied in our previous work [6], where a
different controller is used. The cross-layer design result in [6] is shown in Fig. 8.10.

Fig. 8.10. Cross layer design: the link, MAC and application layer.

There are 18 data points on the first graph in Fig. 8.10 but only 8 are visible. That
is because the other 10 link designs lead to system instability and thus the $H_2$ norm
is infinite. In this example, all links with a 48-bit frame size make the system
unstable when BPSK is used. This is because collisions lead to a large probability of
packet loss if the number of retransmissions is small, even with very reliable links.
The performance gain with this controller design is more dramatic. If we had
designed these layers separately, we could have chosen a reliable link design with
BPSK, 48-bit frames and strong error correction coding, which leads to system
instability.
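
The iterative procedure illustrated with Fig. 8.9 is a coordinate-wise search: optimize the link design with the other two parameters fixed, then the access probability, then the sample period, and repeat until nothing changes. The sketch below shows only that loop structure. The cost function is a made-up stand-in for the $H_2$ norm, and the candidate sets and names are hypothetical; none of the numbers come from the chapter.

```python
# Sketch of the iterative (coordinate-wise) cross-layer search.
# toy_cost is a hypothetical stand-in for the control performance metric.

LINKS = ["A", "B", "C"]                       # hypothetical link designs
ACCESS_PROBS = [0.10, 0.15, 0.20, 0.25, 0.30]
SAMPLE_PERIODS = [4, 5, 6, 7, 8]              # msec

def toy_cost(link, p, period):
    """Hypothetical cost with a mild coupling between p and the period."""
    link_penalty = {"A": 0.6, "B": 0.0, "C": 0.3}[link]
    return (5.0 + link_penalty + 40.0 * (p - 0.2) ** 2
            + 0.05 * (period - 6) ** 2
            + 0.5 * abs(p - 0.2) * abs(period - 6))

def iterative_design(cost, link, p, period, max_rounds=20):
    """Optimize one layer at a time until no parameter changes."""
    for _ in range(max_rounds):
        previous = (link, p, period)
        link = min(LINKS, key=lambda c: cost(c, p, period))
        p = min(ACCESS_PROBS, key=lambda c: cost(link, c, period))
        period = min(SAMPLE_PERIODS, key=lambda c: cost(link, p, c))
        if (link, p, period) == previous:
            break
    return link, p, period

design = iterative_design(toy_cost, "A", 0.10, 5)
print(design, toy_cost(*design))
```

Such a search stops at a point that no single-layer change can improve, which in general is not guaranteed to be the joint optimum over all layers.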

The controller discussed in this chapter gives a strict performance improvement 
for all the network design choices. For example, we compare the performance of the 
two control algorithms with a few link designs in the table below. The sample period 
is fixed at T = 5 msec. We see that the near optimal controller design discussed in
this chapter performs strictly better (smaller $H_2$ norm) than the heuristic controller
design in [6].























Codebook   Modulation   Heuristic   Near Optimal
(7,7)      BPSK         7.0302      6.7747
(15,15)    BPSK         ∞           5.7703
(31,16)    BPSK         ∞           5.8433
(31,11)    BPSK         ∞           5.9155
(31,16)    QPSK         5.8761      5.5315
(31,11)    QPSK         5.7166      5.5169


























The difference in the performance gain of the cross-layer design may be ex- 
plained by the different controller choices. In Fig. 8.9, we use a near-optimal con- 
troller design, which adapts to the random delays and packet losses. The Kalman 
filter gives a near optimal state estimate which adapts to the sensor measurement 
losses. The state feedback controller adapts to random delays and packet losses in 
the control command optimally. In Fig. 8.10, the controller design is a heuristic and 
is not adaptive to partial measurement losses at the state estimator. The improvement 
in the control design gives a much broader stability region in terms of the network 
throughput. Yet the optimal controller design seems to take away some of the poten- 
tial gains of the cross-layer network optimization. Of course, our cross-layer design 
is only optimizing a small set of network variables and we restrict all the links to 
have the same design. A full-scale cross-layer optimization is expected to give much 
more significant performance gains for the optimal controller design discussed in 
this chapter. 


8.6 Conclusions and Discussion 


We propose a cross-layer framework for joint design of distributed control and 
wireless networks. The network design goal is to optimize the control performance, 
which is an implicit function of the network performance. Similarly, control design 
choices impact network performance, which in turn impacts controller performance.
Thus a joint design of control and communication is necessary. Cross-layer design
provides a broad framework where each layer of the network protocol stack, includ- 
ing the controller design, can be optimized relative to the end-to-end performance. 

The goal of cross-layer design is to provide the best end-to-end performance of 
the application. In a distributed control system, the control system is the application 
of the network. Thus cross-layer design also includes designing control algorithms 
that are adaptive and robust to the network performance. We extend the separation 
principle for controllers with packet losses and find the optimal LQG controller to 
be composed of a Kalman filter and a delay dependent state feedback controller. 
The Kalman filter is an extended version of the classical Kalman filter since the 
observations sent to the Kalman filter can be lost during transmission. 
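
To give a feel for how a Kalman filter can cope with lost observations, the sketch below shows the simplest version of the idea for a scalar plant: when a measurement is lost, only the time update is performed. This is a generic intermittent-observation filter offered as an illustration, not the extended filter derived in the chapter (which also handles partial losses), and all parameter values are hypothetical.

```python
# Sketch: scalar Kalman filter step with possibly missing measurements.
# Plant: x[k+1] = a x[k] + w,  y[k] = c x[k] + v, with Var(w) = q, Var(v) = r.
# Parameter values are hypothetical.

def kalman_step(x_hat, P, y, received, a=1.2, c=1.0, q=0.01, r=0.1):
    """One predict/update cycle; the update is skipped if y was lost."""
    x_pred = a * x_hat              # time update of the estimate
    P_pred = a * a * P + q          # time update of the error variance
    if not received:
        return x_pred, P_pred       # no measurement: prediction only
    gain = P_pred * c / (c * c * P_pred + r)
    x_new = x_pred + gain * (y - c * x_pred)
    P_new = (1.0 - gain * c) * P_pred
    return x_new, P_new

x_hat, P = 0.0, 1.0
for y, received in [(0.3, True), (None, False), (None, False), (0.5, True)]:
    x_hat, P = kalman_step(x_hat, P, y, received)
    print(received, round(x_hat, 4), round(P, 4))
```

For an unstable plant such as the one sketched here (a > 1), the error variance grows during every run of consecutive losses, which is one way to see why the packet loss probability enters the control performance.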

We then consider an iterative cross-layer design over a subset of the network 
layers with an optimal controller as the application layer. Such an iterative design can 
give substantial performance gains for certain controller designs. We also uncover 
some surprising insights. In particular, we show that an uncoded link design, which is 
often undesirable due to its unreliability, can be optimal under certain circumstances 
since it achieves the optimal tradeoff among data resolution, time delay and packet 
loss probability. Note that an iterative design is only suboptimal. A true joint design 
over all the network layers should give more significant performance gains. 

An intriguing question to ask is whether there is a separation between the control
design and the communication design. The control performance index is a complicated
function of both the control design and the communication design. Determining the
optimality of separate designs is therefore a challenging problem.

The problem becomes even more difficult when we have a fast fading channel, 
where the performance provided by the wireless links is time varying. In a fast fad- 
ing environment, the probabilistic performance provided by the network is no longer 
stationary. There is a lack of theory in evaluating and designing such systems. A net- 
work control system on the move, such as an Automated Highway System, needs to 
take the time-varying channel into account. Adaptive link layer techniques are
commonly used to compensate for variations in the links. In a cross-layer design, the
MAC protocol, the routing algorithm, and the controller designs must adapt to the
channel states as well. This area is just beginning to be explored. An important
question to ask in this adaptive cross-layer design is which parameters should be
shared among different layers of the network and how each layer can be made robust
to changing network conditions.


9.1 Background


Increasing stream traffic on the Internet and higher performance expectations 
from the users of services built on the elastic traffic model have meant that the net- 
work should provide more predictable performance in terms of delays and loss of 
packets. Further, with increasing variety in applications and in users, the network 
should also provide different grades of service. Since the supply of network re- 
sources, e.g., buffer and bandwidth, is fixed over small timescales, predictability of 
performance requires that the network be capable of controlling the behavior of the 
demand for these network resources, especially during times of congestion when the 
instantaneous demand exceeds supply. Allocating network resources among compet- 
ing demands during periods of congestion is essentially a conflict resolution problem. 
In this chapter we consider pricing of network resource usage to resolve this conflict.
Appropriate usage- and congestion-based pricing will provide incentives that influence
the users’ behavior, and their price sensitivity will elicit their true requirements.


9.2 Approaches to Network Pricing 


Pricing in multi-service packet networks attracted very little research till the
1990s, possibly because of a lack of economic background in the networking
community [54]. An excellent introduction to network economics is available in [54],
possibly the only networking textbook to devote a chapter to this purpose. The
INDEX (INternet DEmand EXperiment) project [2, 13] was a pioneering attempt
to study the control of Internet demand through congestion pricing. It is also
possibly the only experimental study of pricing Internet service.

A simple approach to network pricing for QoS would be to provide different 
types of service by having a separate queue for each service type and service them 
according to a discipline that provides the requisite QoS. The Paris Metro pricing 
(PMP) [41] is just such a scheme, in which service in the different queues is priced
differently. [41] suggests that each of the queues be serviced at the same rate, much
like in the Paris Metro of yore and the ‘locals’ of Mumbai today. Although no
guarantees are offered, relative QoS is provided in the high-priced part of the network
because high prices tend to keep the average arrival rate lower, and hence the
traffic that chooses a high-priced service will, on the average, experience less
congestion than traffic that chooses the low-priced service. [17] and [23] assess the
viability of the PMP pricing scheme for the service provider. [11] describes an
adaptive pricing scheme to maximize the network revenue and also analyzes the social
optimality of the greedy policy in a PMP-like system that is called the ‘Tirupati
queue’. [52, 53]
obtain the stability criteria, the revenue rate and mean delays in the PMP and the 
Tirupati systems when the customers adopt a ‘join minimum cost queue’ policy and 
the prices are static. [43] considers the problem of pricing service classes to force 
users to choose the nominal grade of service that they are assigned. 
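
The PMP argument, that identical service rates combined with different prices yield different congestion levels, can be illustrated with a toy calculation. The sketch below models each queue as an M/M/1 queue, which is our simplifying assumption and not part of [41]; the service rate and the arrival rates are hypothetical.

```python
# Sketch: two identically served queues; the higher price is assumed to
# attract less traffic. M/M/1 mean delay is used purely for illustration.

def mm1_mean_delay(arrival_rate, service_rate):
    """Mean time in system for a stable M/M/1 queue."""
    if arrival_rate >= service_rate:
        raise ValueError("queue is unstable")
    return 1.0 / (service_rate - arrival_rate)

SERVICE_RATE = 10.0                                       # same for both queues
ARRIVAL_RATES = {"high-priced": 4.0, "low-priced": 8.0}   # hypothetical

delays = {name: mm1_mean_delay(rate, SERVICE_RATE)
          for name, rate in ARRIVAL_RATES.items()}
print(delays)
```

No delay bound is guaranteed to either class here; the better service in the high-priced queue is purely a consequence of its lower load.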

Different grades of QoS can also be provided through a priority mechanism. 
[9, 18] describe a priority-based pricing mechanism. Here the users indicate the value 
of their traffic by selecting a priority level and during congestion periods, low priority 
traffic is delayed or even dropped. The impact of priority pricing on QoS for a typical 
user and congestion is studied in [18, 19]. 

See [14] and [10] for surveys of some early pricing schemes for QoS-enabled
networks.

Much of the rest of the network pricing literature is aimed at developing a theory
for network pricing, which has been addressed from many perspectives. In the rest of
this section we first provide an overview of two of these approaches: (1) welfare-
maximization based pricing and (2) game-theoretic schemes. The network and the
users define utility functions on the price of service and the congestion experienced
by the service. In the welfare-maximization schemes the users adjust their resource
usages and the network adjusts the prices to maximize their respective utility
functions. In the game-theoretic approach, the objective is to converge towards an
equilibrium that is a good operating point for the network. We also provide an
overview of the ‘regulation’ based network pricing that we propose in this chapter.


9.2.1 Welfare maximization methods 


The network and the users are assumed to have well defined utility functions
that characterize their resource requirements and their willingness to pay per unit
resource. Typically, the network is assumed to not know the individual utility
functions of the users. The QoS received by the user from the network is assumed to
be built into the resource usage and hence the utility function, for example, by
translating the desired QoS into an effective bandwidth requirement. The goal of the
network is to allocate resources to maximize a network objective function that
depends on the users’ utilities, e.g., the total utility over all the users, which is also
called ‘social welfare’. To achieve this goal, the network uses pricing as a means to
obtain implicit information about the users’ utility functions and to allocate resources
accordingly. A typical scheme of this kind would be of the tâtonnement type, where
the optimization problem of the user is the dual of that of the network.

A considerable amount of literature based on this approach is devoted to rate 
control where the network adjusts the price per unit bandwidth along the routes ac- 
cording to the demand and the users respond by adjusting their transmission rates to 
optimize their utility function. One of the earliest works of this type is [38]. Users 
are differentiated based on their traffic characteristics and the maximum end-to-end 
delay they are willing to experience. Users purchase bandwidth and buffers from 
the network to satisfy their QoS requirements. The network and the users iteratively
adjust their prices and demand, respectively, to converge to an optimum allocation
among the users. A ‘proportionally fair’ bandwidth allocation, in which the network
resource allocation is proportional to the users’ willingness to pay, is proposed in
[25]. [26] proposes a simple rate control mechanism where the network adjusts the
price per unit bandwidth on the routes and the users choose a transmission rate to 
maximize their utility function. It is shown that in equilibrium, the resulting scheme 
maximizes the social welfare and achieves a proportionally fair allocation of the 
network resources. In [25, 26], the overall objective is decomposed into separate op- 
timization subproblems for the network and for the users, where each user chooses 
a willingness to pay and the network allocates rates to these sources in a way that is 
proportionally fair. Thus in this approach, the users decide their payment and receive 
what the network allocates. In [36] the users decide their resource usage and pay 
what the network charges. 
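
For a single link, the proportionally fair allocation described above has a simple closed form, sketched below. Each user declares a willingness to pay $w_i$; the link of capacity C sets one price per unit bandwidth so that the rates $x_i = w_i / \text{price}$ exactly fill the link. The single-link setting, the numbers and the function names are illustrative assumptions, not taken from [25, 26].

```python
import math

# Sketch: proportionally fair sharing of one link.
# willingness[i] is user i's declared payment per unit time (hypothetical).

def proportionally_fair(willingness, capacity):
    """Return (rates, price) with rates proportional to willingness to pay."""
    price = sum(willingness) / capacity        # price per unit bandwidth
    rates = [w / price for w in willingness]
    return rates, price

def weighted_log_utility(willingness, rates):
    """Objective that the proportionally fair allocation maximizes."""
    return sum(w * math.log(x) for w, x in zip(willingness, rates))

willingness = [1.0, 2.0, 5.0]
rates, price = proportionally_fair(willingness, capacity=16.0)
print(rates, price)
```

Shifting bandwidth between any two users while keeping the total fixed lowers the weighted logarithmic objective, which is a quick numerical way to check the allocation.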

Welfare maximization mechanisms for fair allocation of resources have also been 
applied to window based congestion control algorithms. [31, 33] describe a family
of window-based allocation schemes that can be used to achieve a proportionally fair 
allocation or approximate a max-min fair allocation arbitrarily closely. This scheme 
was motivated by TCP, the Internet’s congestion control protocol. [16] describes 
another marking scheme for TCP packets to achieve proportional fairness. See [37] 
for a recent survey on the design of fair congestion control protocols for the Internet. 

In an interesting study, [45, 46] propose a dynamic programming formulation of 
the revenue and welfare maximization problem and show that the performance of 
an optimal dynamic pricing strategy is closely matched by a suitably chosen static 
price, which does not depend on instantaneous congestion. Other works in this genre 
include a study about the dynamics of congestion pricing in [15]. Some of the theory 
developed for wireline networks is now being applied to transmission scheduling in 
wireless networks, e.g., [40, 47]. 


D. Garg, V.S. Borkar and D. Manjunath

9.2.2 Game theoretic approach 


There are many game theoretic formulations for rate allocation and pricing of 
bandwidth usage. Much of the work in this direction considers routing of the traf- 
fic and derives the conditions for the existence and uniqueness of an equilibrium 
[1, 24, 27, 30, 32, 42]. The existence of equilibria allows the design of network man- 
agement policies that induce efficient equilibria by penalizing the users who deviate 
from the equilibrium. In a recent work [28], QoS routing is considered where an 
arriving connection finds the route that minimizes the total cost while satisfying its 
QoS constraints, e.g., maximum end-to-end delay. The cost of reserving a unit rate 
over a link is a function of the aggregate rate reserved on the link. The interaction 
among the various connections is modeled as a game and the goal of the network 
provider is to decide link prices in such a way that the operating point coincides with 
the Nash equilibrium of the underlying game. [55] considers game theoretic pricing 
for rate control and shows that the proportionally fair allocation is a Nash bargaining 
solution. In [3], the interaction between the service provider and the users is modeled 
as a Stackelberg game where users are the followers and the service provider is the 
leader whose goal is to maximize revenue. 

[39] models a priority based network as a non-cooperative game between users 
and the network and shows that a unique equilibrium exists for this game and that 
the bandwidth allocation in equilibrium is weighted max-min fair. [20] considers 
pricing by multiple service providers as a non-cooperative Nash game and shows 
that the equilibrium pricing may be unfair and inefficient. 


9.2.3 Other work 


Providing QoS guarantees to competing traffic classes using a methodology 
based on economic models is considered in [48]. There is also a significant amount of 
work on auction based methods. Auction mechanisms have been suggested in [34]. 
More recently, [49] describes a smart market based mechanism for pricing in a Diff- 
Serv based network in which the users bid for the service and the network assigns 
the quality to the users based on their bids. 


9.2.4 ‘Regulation’ based pricing: a preview 


In this chapter we consider a different approach to pricing of network services to 
provide QoS to users. We propose a simple dynamic pricing scheme for differenti- 
ated service that is based on a ‘regulation’ viewpoint. In this scheme we assume that a 
nominal (‘ideal’) profile for the resource utilization in each of the different grades of 
service is specified a priori. We dynamically adjust the prices for the different grades 
of service so as to modulate user behavior in a manner that drives resource utiliza- 
tion towards this nominal profile. Service grade guarantees are provided through the 
nominal profile prescribed for each grade of service. This scheme was originally 
proposed for a single service station in [7]. We now extend it to the network case.
Each link in the network offers multiple grades of service, much like the per hop
behavior of DiffServ [5]. A set of routes is defined on the network. A route $r$ is
defined as a sequence of $n_r$ 2-tuples of the form $(i, j)$ where $i$ is a link on the route
and $j$ is the service grade for that route on link $i$. As mentioned above, the nominal
congestion levels in our scheme are prescribed a priori for each of the routes. These
can be obtained by solving an offline static optimization problem (two possibilities
are mentioned later) or otherwise. Arriving packets are allowed to choose any of the
routes that are available. The choice of a route implicitly chooses the grade of service
on each link.

We can see that our pricing scheme is inherently a secondary pricing scheme, 
i.e., it is a scheme for stabilizing the desired resource utilization variables around 
values separately arrived at through another primary scheme. This is in the spirit of 
regulation problems in optimal control where one tries to control a system so as to 
make it track a predetermined trajectory. 

The scheme supports multiclass traffic with each class having its own utility func- 
tion for choosing the route, and hence the service type, based on the grade of service, 
current congestion levels and the prices and/or any other criteria. This function could 
take any form (but must satisfy some properties to be enumerated later) and the net- 
work does not have to know this utility function. As mentioned above, the nominal 
congestion levels in our scheme are prescribed a priori. 

We will argue later that a link state protocol can be used to exchange the conges- 
tion and pricing information required by the pricing scheme. We will also see that 
the ‘regulation’ based pricing scheme has the following desirable characteristics. 


• It provides well defined guarantees, albeit statistical, on the grades of service.

• It supports congestion control and traffic management.

• It is economically efficient and elicits the true behavior of users.

• It is simple to implement, requires minimal measurement, and is compatible with
current technologies and proposed standards.


This work is admittedly at a ‘proof of concept’ level and will need some refine- 
ment before it can be converted into a realistic scheme for large networks. We will 
discuss some of these issues. 

The rest of the chapter is organized as follows. Section 9.3 describes the network 
and the user model. In Section 9.4 we describe the pricing scheme and present its 
analysis. We also describe some variations on the basic scheme that are expected 
to improve the system performance. Section 9.5 describes some of the simulation 
results. Section 9.6 discusses issues in the implementation of the scheme where we 
describe methods to calculate the operating point and also to communicate the con- 
gestion information. We conclude in Section 9.7. 


9.3 Network and User Model 


We consider a packet communication network of $N$ links with link $i$ having
a transmission capacity of $\mu_i$ and providing $J_i$ grades of service to the packets
that are to be transmitted on the link. A separate logical queue may be maintained
for packets corresponding to each grade of service. Define a route $r$ as a
sequence of $n_r$ links and the grade of service on each of these links, i.e., route
$r := [(i_1, j_1), (i_2, j_2), \ldots, (i_{n_r}, j_{n_r})]$ where $i_1, \ldots, i_{n_r}$ are the links used by route $r$
and $j_1, \ldots, j_{n_r}$ are the service grades used on the respective links. Let $\mathcal{R}$ denote the
set of routes with $|\mathcal{R}| = K$.

The $J_i$ queues of link $i$ can be serviced according to an arbitrary policy that is
designed to provide the required QoS, e.g., strict round robin, weighted fair queueing
(WFQ) or any of their variants.

Let $y_i(t) := [y_{i,1}(t), \ldots, y_{i,J_i}(t)]$, $y_{i,j}(t) \in \mathbb{R}$, denote the state of link $i$ at time
$t$ with increasing $y_{i,j}(t)$ denoting increasing demand (congestion or utilization) for
service grade $j$ of link $i$. Let $y(t) := [y_1(t), y_2(t), \ldots, y_N(t)]$ be the state of the
network at time $t$. Although we allow $y(t)$ to be any real vector that will be defined
by the policy in which the multi-class queues are served, it is logical to let $y_{i,j}(t)$ be
either buffer occupancy (expressed as the number of bits or number of packets) or the
total arrival rate into the class. We assume the existence of a ‘natural’ upper bound
$b_{i,j}$ on $y_{i,j}(t)$, $j = 1, 2, \ldots, J_i$, $i = 1, 2, \ldots, N$ (e.g., the buffer size if $y_{i,j}(t)$ is the
queue length, or the capacity allocated to grade $j$ on link $i$ if $y_{i,j}(t)$ is utilization).
In the sequel, we assume $y_{i,j}(t)$ to be the buffer occupancy and describe the model
around this assumption. The description can be easily modified to suit other measures
of congestion.

We assume that the congestion contributions of the routes using service grade $j$
on link $i$ combine in some way to yield $y_{i,j}(t)$. Let $z^r_{i,j}(t)$ be the contribution of
route $r$ towards the congestion in service grade $j$ of link $i$ at time $t$. Then, for $t \ge 0$,
$j = 1, \ldots, J_i$ and $i = 1, \ldots, N$,

$$b_{i,j} \ge y_{i,j}(t) = \phi_{i,j}\big(z^1_{i,j}(t), \ldots, z^K_{i,j}(t)\big). \tag{9.1}$$

$y_{i,j}(t)$ will be independent of $z^r_{i,j}$ if service grade $j$ is not used by route $r$ on link $i$.
Of course, $\phi_{i,j}(\cdot)$ has to be increasing in $z^r_{i,j}$. In the simplest model we could assume
that contributions are additive and have $y_{i,j}(t) = \sum_{r \in \mathcal{R}_{i,j}} z^r_{i,j}(t)$ where $\mathcal{R}_{i,j}$ is the
set of routes using service grade $j$ on link $i$.
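
The additive model can be sketched in a few lines. Routes are represented as lists of (link, grade) pairs, each route reports its contribution on every hop it uses, and the link state is the per-(link, grade) sum, checked against the bound. The topology, the contributions and the bounds below are hypothetical.

```python
from collections import defaultdict

# Sketch of the additive model: y[(i, j)] = sum over routes of z^r[(i, j)].
# All numbers are hypothetical.

# contribution of each route on each (link, grade) hop it uses
CONTRIBUTIONS = {
    "r1": {(1, 1): 3.0, (2, 1): 1.0},
    "r2": {(1, 1): 2.0, (3, 2): 4.0},
    "r3": {(2, 2): 5.0},
}
BOUNDS = {(1, 1): 8.0, (2, 1): 8.0, (2, 2): 8.0, (3, 2): 8.0}  # b_{i,j}

def link_state(contributions):
    """Sum the route contributions per (link, grade)."""
    y = defaultdict(float)
    for per_hop in contributions.values():
        for hop, z in per_hop.items():
            y[hop] += z
    return dict(y)

y = link_state(CONTRIBUTIONS)
within_bounds = all(y[hop] <= BOUNDS[hop] for hop in y)
print(y, within_bounds)
```

A route that does not use a given (link, grade) pair simply has no entry for it, which matches the remark that $y_{i,j}(t)$ is independent of that route's contribution.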

Let $z^r(t) \in \mathbb{R}$ denote the congestion on route $r$ at time $t$ with increasing $z^r(t)$
denoting increasing congestion, and hence demand. We define $z^r(t)$ as a function of
its contribution on the links of the route, i.e.,

$$z^r(t) = \psi_r\big(z^r_{i_1,j_1}(t), \ldots, z^r_{i_{n_r},j_{n_r}}(t)\big). \tag{9.2}$$

$\psi_r(\cdot)$ could be defined so as to prescribe the average of the total number of
packets in the network ($z^r(t) = z^r_{i_1,j_1}(t) + \cdots + z^r_{i_{n_r},j_{n_r}}(t)$), keep the maximum
of the congestion on the links due to this route at a prescribed level
($z^r(t) = \max\{z^r_{i_1,j_1}(t), \ldots, z^r_{i_{n_r},j_{n_r}}(t)\}$), or maintain a minimum average congestion
on the links on the route ($z^r(t) = \min\{z^r_{i_1,j_1}(t), \ldots, z^r_{i_{n_r},j_{n_r}}(t)\}$).
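
The three aggregation rules mentioned for the route congestion can be written as one small helper. The function name, the rule labels and the sample values are hypothetical.

```python
# Sketch: route congestion as a function of the per-hop contributions
# of that route, using the sum, max or min rule from the text.

def route_congestion(per_hop, rule="sum"):
    """Aggregate a route's per-hop contributions into one number."""
    values = list(per_hop)
    if not values:
        raise ValueError("a route must have at least one hop")
    if rule == "sum":
        return sum(values)
    if rule == "max":
        return max(values)
    if rule == "min":
        return min(values)
    raise ValueError("unknown rule: %r" % rule)

hops = [3.0, 1.0, 4.5]   # hypothetical contributions on a three-hop route
for rule in ("sum", "max", "min"):
    print(rule, route_congestion(hops, rule))
```

All three rules are nondecreasing in each per-hop contribution, so increasing congestion on any hop never lowers the route congestion.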


Let $z(t) = [z^1(t), z^2(t), \ldots, z^K(t)]$ be a vector representing the state of all the
routes of the network at time $t$.

To be able to provide a required level of QoS, the network service provider fixes
an operating point $y^*$ for the service classes and an operating point $z^*$ for the routes,
where

$$\begin{aligned}
y_i^* &= [y_{i,1}^*, \ldots, y_{i,J_i}^*], \\
y^* &= [y_1^*, \ldots, y_N^*], \\
z^* &= [z^{1*}, \ldots, z^{K*}].
\end{aligned} \tag{9.3}$$

The price per unit traffic on route $r$ at time $t$ is denoted by $p^r(t)$. Define $p(t) :=
[p^1(t), \ldots, p^K(t)]$ to be the network price vector at time $t$. We assume that $p(t)$ is
posted by the service provider and is available to the users. $p(t)$ depends on $z(t)$
and is adapted so as to keep the congestion level on the routes close to the desired
operating point $z^*$. $y^*$ does not directly enter our adaptation scheme but enters some
of the mechanisms that we suggest for choosing $z^*$. It can also be used by the traffic
classes in defining their utility functions.

We next describe the model for the user process. As we have mentioned earlier,
we allow multiclass traffic. Users of class $s$ are assumed to incur a cost of
$C_s^r(x, z^r(t), p^r(t))$ when they inject $x$ units of traffic into route $r$ at time $t$, and this
cost is increasing in $z^r(t)$ and $p^r(t)$. An example cost function can be of the form
$C_s^r(x, z^r(t), p^r(t)) = x\,p^r(t) - U_s^r(x, z^r(t))$ where $U_s^r(x, z^r(t))$ is the utility of
sending $x$ units of traffic on route $r$ when the congestion level is $z^r(t)$.

Following [7], two user models can be defined: a small user model and a large
user model. Consider a user of class $s$ that needs to send $x_s(t)$ units of traffic at time
$t$. Let $R_s \subset R$ be the set of routes that class $s$ can use. In the small user model the
user will assign all the $x_s(t)$ units to the single route from among those that it can use
that minimizes its cost. For the example cost function defined above, the user will
allocate its traffic to the route $r$ obtained as

$$\arg\min_{r \in R_s} \{ p^r(t)\, x_s(t) - U_s^r(x_s(t), z^r(t)) \}.$$

In the large user model the source will partition $x_s(t)$ among usable routes so as
to minimize its cost. For the cost function that we give above, a large user with $x_s(t)$
units of traffic will send $x_s^r(t)$ to route $r$, so as to maximize

$$\sum_{r \in R_s} \{ U_s^r(x_s^r(t), z^r(t)) - p^r(t)\, x_s^r(t) \}$$

subject to $\sum_{r \in R_s} x_s^r(t) = x_s(t)$. Here, we have assumed that utility functions are
additive over the routes.
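
The small user rule can be illustrated with a short sketch. It is only a minimal example under stated assumptions: the utility form in `utility`, the route names and all numeric values are hypothetical and are not taken from the chapter; only the rule "pick the usable route that minimizes $x p^r - U(x, z^r)$" comes from the text.

```python
import math

def utility(x, z):
    """Hypothetical utility of sending x units at congestion level z.

    Assumed form (not from the chapter): concave in x, decreasing in z.
    """
    return 10.0 * math.log(1.0 + x) / (1.0 + z)

def small_user_route(x, routes, prices, congestion):
    """Return the usable route r that minimizes x * p^r - U(x, z^r)."""
    def cost(r):
        return x * prices[r] - utility(x, congestion[r])
    return min(routes, key=cost)

# Illustrative numbers only: r1 is pricier but less congested than r2.
prices = {"r1": 1.0, "r2": 0.5}
congestion = {"r1": 1.0, "r2": 3.0}
chosen = small_user_route(2.0, ["r1", "r2"], prices, congestion)

# Raising the price of r1 enough pushes the same user to r2.
chosen_after_increase = small_user_route(
    2.0, ["r1", "r2"], {"r1": 3.0, "r2": 0.5}, congestion
)
```

With these numbers the user first prefers the less congested route and switches once its price is raised, which is the feedback the price adaptation relies on.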


9.4 The Price Adaptation 


The aim of the price adaptation scheme is to adapt the route price so as to keep
the traffic on the route close to the prescribed nominal profile $z^*$. As can be seen
from our description of the users, we assume that the users behave in an individually
optimal manner with respect to their respective utility functions and choose the route
that minimizes their cost (or, equivalently, maximizes their net utility). The ‘price setter’ for
route $r$ uses $z^{r*}$ and $z^r(t)$ and adjusts the price to achieve the regulation objective.
The price per unit volume of the traffic on route $r$ and the congestion on it, $p^r(t)$ and
$z^r(t)$ respectively, are communicated to the user population with possibly non-zero
delays.

A simplistic way of regulating the congestion on route $r$ to $z^{r*}$ would be to
use zero price when $z^r(t) < z^{r*}$ and an extremely high price when $z^r(t) > z^{r*}$.
This clearly leads to high fluctuations in the arrival process to the routes and can
cause severe packet losses. To ensure a graceful adaptation of the traffic process into
the routes, we use the following price adaptation equation. Let $t_1, t_2, t_3, \ldots$ be the
epochs at which the price is adapted.


$$p^r(t_{i+1}) = \Gamma\big(p^r(t_i) + a\, p^r(t_i)\, (z^r(t_{i+1}) - z^{r*})\big) \quad \forall r \in R. \qquad (9.4)$$


Here $a > 0$ is a small scalar called the ‘learning parameter’ and $\Gamma(\cdot)$ is the projection
onto the interval $[\eta, M]$, $\eta, M > 0$, with $\eta$ a small number and $M$ a very large
number. The projection $\Gamma(\cdot)$ puts an upper bound on the price for a route and also
prevents the scheme from getting stuck at zero, i.e., $\Gamma(x) = \min(\max(x, \eta), M)$.
The intuition behind the adaptation equation in (9.4) is simple: decrease the price if
the congestion on a route falls below the nominal and vice versa.

We call Equation 9.4 the linear deviation scheme because $(z^r(t_{i+1}) - z^{r*})$ is
the deviation of the current congestion from the nominal value and the adaptation
equation is linear in the deviation.
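
The update in (9.4) for a single route can be sketched in a few lines. This is an illustrative sketch only: the function names and the numeric values of $a$, $\eta$, $M$, $z^{r*}$ and the observed congestion samples are assumptions chosen to make the arithmetic easy to follow.

```python
def project(x, eta, big_m):
    """Projection onto [eta, big_m], i.e. min(max(x, eta), big_m)."""
    return min(max(x, eta), big_m)

def linear_price_update(p, z, z_star, a, eta, big_m):
    """One step of the linear deviation scheme (9.4) for one route."""
    return project(p + a * p * (z - z_star), eta, big_m)

# Illustrative run: congestion above nominal twice, then below once.
p = 1.0
for z in [4.0, 4.0, 1.0]:
    p = linear_price_update(p, z, z_star=2.0, a=0.01, eta=1e-3, big_m=1e3)
# p: 1.0 -> 1.02 -> 1.0404 -> 1.029996 (up, up, then down)
```

Because the increment is proportional to the current price, the lower bound $\eta > 0$ matters: a price of exactly zero would never move again.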


9.4.1 Analysis 


We sketch briefly the theoretical underpinnings of our scheme, along the lines of
[7]. We assume the following: If the price vector was frozen at $p = [p_1, \ldots, p_K]$,
then the queue length process in the queues on the routes, and hence the congestion
on the routes, is asymptotically stationary ergodic. Let $E_p[\,\cdot\,]$ denote the corresponding
stationary average. Define $h(\cdot) = [h_1(\cdot), \ldots, h_K(\cdot)]^T : (\mathbb{R}^+)^K \to \mathbb{R}^K$ by
$h_r(p) := p_r (E_p[z_r(t)] - z_r^*)$ $\forall r \in R$. Consider the ‘limiting o.d.e.’ for our algorithm:

$$\dot{q}(t) = h(q(t)). \qquad (9.5)$$


We shall need the following assumptions. 


(A1) $h(\cdot)$ is continuously differentiable.
(A2) $h_r(p) > 0$ when $p_r = \eta$ and $< 0$ when $p_r = M$ for $1 \le r \le K$.
(A3) $\partial h_i / \partial p_j \ge 0$ for $i \ne j$ and $i, j = 1, \ldots, K$.


(A1) implies that (9.5) is locally well-posed. It will hold, e.g., when the stationary
law of $\{z(t)\}$ depends smoothly on $p$ when the latter is kept fixed, which is a mild
requirement. The first half of (A2) means that when the price for a route is very low
(to be precise, $= \eta$), it will attract traffic, in turn pushing the prices up. The second
half means that high prices will correspondingly lead to low traffic into that route,
hence lower congestion, low enough to warrant a drop in the prices. (A3) implies that
an increase in the price of one route tends to drive arrivals to the others and increase
the congestion levels there. Both make eminent economic sense and will hold, for
example, when the users have a cost function that is increasing in prices and join the
route that minimizes the cost, and either a cheap alternative is always available or
balking occurs. The presence of the projection operator $\Gamma(\cdot)$ ensures that the iterates
remain bounded.

Standard results from the ‘o.d.e.’ approach to stochastic approximation algorithms
[29] imply that the algorithm in (9.4) will asymptotically track (9.5) in the
following sense: Define $\bar{p}(t)$, $t \ge 0$, by

$$\bar{p}(na) = p(n), \quad n \ge 0,$$

with linear interpolation on each interval $[na, (n+1)a]$. Then for any $T > 0$,

$$\lim_{t_0 \to \infty} E\Big[ \sup_{t \in [t_0, t_0 + T]} \|\bar{p}(t) - \tilde{p}_0(t)\|^2 \Big] = O(a), \qquad (9.6)$$

where $\tilde{p}_0(\cdot)$ is a solution of (9.5) on $[t_0, \infty)$ with $\tilde{p}_0(t_0) = \bar{p}(t_0)$. See, e.g., [8].
Thus (9.4) is a ‘noisy discretization’ of (9.5) with $O(a)$ error. The projection $\Gamma(\cdot)$
would lead to a boundary correction in the o.d.e. limit if the driving vector field were
directed outwards at any point in the boundary. This follows from the explicit expression
for this correction term in, e.g., [29], Section 5.1. In our case such a correction
term is missing in the o.d.e. limit, because the driving vector field of (9.5) will be
directed inwards at the boundary of the region $[\eta, M]^K$, thanks to (A2). This last
observation also implies in particular that the associated flow maps this region into itself
and hence by the Brouwer fixed point theorem [44], has at least one fixed point, i.e.,
(9.5) has at least one equilibrium in this region. Absence of (A2) will allow boundary
equilibria, which is not a problem, except that they warrant a messier analysis
involving boundary correction. These, however, may not be the desired equilibria, in
the sense that they may not correspond to the target $z^*$. The idea here is that $\eta$, $M$, $z^*$
should be arrived at from known traffic characteristics, so that (A2) is reasonable.
If the input traffic rate is either too high or too low, it is no longer so. Mathematically,
the assumptions (A1)-(A3) qualify (9.5) as a cooperative o.d.e. in the sense of
[21, 50]. Thus we have:


Lemma 1. (a) For generic initial conditions (i.e., for initial conditions belonging to
an open dense set), $q(\cdot)$ converges to the set of equilibria $H$ of (9.5) (though not
necessarily to a single point thereof).

(b) If the inequalities in (A3) above are strict, then for generic initial conditions, $q(\cdot)$
converges to one of the equilibria in $H$, depending on the initial condition.


These are minor variations of the results in the references above as observed in
[7]. The boundedness of trajectories required in these results of [50] is ensured by
(A2). For later reference, we denote by $G$ the open dense set of initial conditions
for which the conclusions of the lemma hold. The condition in (b) above can be
relaxed to the requirement that the Jacobian matrix $Dh(p) := [\frac{\partial h_i}{\partial p_j}(p)]_{1 \le i,j \le K}$ be
irreducible at every point. Also, both (a) and (b) will mean point convergence if $H$
is discrete. Note also that $H \ne \emptyset$ is a part of the conclusion of Lemma 1. We now
seek conditions for $H$ to be discrete. Consider the condition:


(B1) The Jacobian matrix Dh(x) is non-singular at every point in H. 


This condition is not too restrictive: Define $\tilde{h}(p) := E_p[z(t)]$. By Sard’s theorem
([44], p. 130), $D\tilde{h}(p^*)$, and therefore $Dh(p^*)$, is non-singular for almost all choices
of $z^*$, i.e., for all $z^*$ outside a set of zero Lebesgue measure. (Some care has to be
taken in applying this argument when natural constraints force the range of $\tilde{h}$ to be
strictly a subset of $\mathbb{R}^K$, see [7].) By the inverse function theorem [51], the zeros of $h$,
i.e., the equilibria of (9.5), are isolated under (B1). Structural stability considerations
[22] suggest that it is reasonable to suppose that these equilibria are hyperbolic. Let
$S$ denote the (discrete) set of stable equilibria. Then the considerations above make
the following a reasonable assumption:


(B2) There exists an open dense set U such that all trajectories of (9.5) initiated in U 
converge to some point in S. 


Intuitively, we want $U^c$ to include the initial conditions excluded by Lemma 1
and the equilibria not in $S$ and their stable manifolds, i.e., $U$ will be the union of the
domains of attraction of the stable equilibria of the o.d.e. We also assume:


(B3) $\{p(t)\}$ is asymptotically stationary and the stationary law $\nu_a$ of $p(t)$ (where ‘$a$’
is the stepsize in (9.4)) has a density w.r.t. the Lebesgue measure on $\mathbb{R}^K$.


The stationarity would be true, e.g., if the process $z(\cdot)$ were a function of a
positive recurrent Markov chain, a common situation. The Lebesgue continuity of the
stationary distribution is an assumption. If it were not true, one could enforce it
by adding additional Lebesgue-continuous i.i.d. noise to the r.h.s. of our iteration.
In our simulations, the intrinsic randomness of the system seemed to suffice. Let
$B(\epsilon) := \{p \in \mathbb{R}^K : \inf_{p^* \in S} \|p - p^*\| < \epsilon\}$ (i.e., the $\epsilon$-neighborhood of $S$) and set

$$T(x) = \sup\{t \ge 0 : q(t) \notin B(\epsilon)\} = \inf\{t \ge 0 : q(s) \in B(\epsilon) \ \forall s \ge t\},$$

where $q(\cdot)$ satisfies (9.5) with initial condition $x$. $T(x)$ is the least time such that
the trajectory starting at $x$ lies in $B(\epsilon)$ thereafter. (Thus $T(x) < \infty$ for $x \in U$.)
Conditions (B2), (B3) imply in particular that $\nu_a(\cup_n \{x : T(x) < n\}) = 1$. Thus
given any $\delta > 0$, we can pick an $N(\delta) \ge 1$ such that $\nu_a(\{x : T(x) < N(\delta)\}) >
1 - \delta$. Consider a stationary process $\{p(t)\}$ governed by (9.4) and the corresponding
equation (9.5) with $t_0 = 0$ and $T = N(\delta)a$. Then it is clear that

$$\nu_a(B(\epsilon)) = P(p(T) \in B(\epsilon + O(a))) > 1 - \delta. \qquad (9.7)$$
We have proved (see also [4]): 


Theorem 1. The stationary distribution $\nu_a$ of (9.4) concentrates on the stable
equilibria of (9.5) in the sense of (9.7).

Note that any equilibrium point $p$ of (9.5) corresponds to the desired equality
$z^* = E_p[z(n)]$, so they are all equivalent for our purposes.


Remark: As in [7], one has the following sufficient conditions for (9.5) to have a
unique equilibrium which is asymptotically stable:

$$\frac{\partial g_i}{\partial p_i}(p) > \sum_{j \ne i} \left| \frac{\partial g_i}{\partial p_j}(p) \right|, \quad \text{for route indices } i, j = 1, \ldots, K, \qquad (9.8)$$

where $g(p) := p + h(p)$ $\forall p \in \mathbb{R}^K$. This means that the congestion in the $i$th route
will be much more sensitive to its own price changes than to the price changes in
other competing routes.

We shall now describe qualitatively what to expect from the price adaptation 
scheme for the various, progressively more restrictive scenarios described above. In 
absence of (B1)-(B3), since (9.5) still converges to the equilibrium set S ‘generi- 
cally’, we can expect the prices to wander in a neighborhood of this set with high 
probability. Note, however, that any price vector in S will lead to average congestion 
being z*. Thus, while the prices may wander, one can expect the average congestion 
profile to fluctuate around z* as desired. The situation when (B1)-(B3) hold is better. 
Results of [12] suggest that typically the adaptation scheme will spend large times 
in the neighborhood of one of the isolated equilibria in S with rare transitions from 
one to the other. Under (9.8), of course, we expect the adaptation scheme to remain 
in a neighborhood of p* most of the time. 

We remark here that our control scheme allows for occasional resets. To motivate 
the need for such a feature, consider a network in which there has been a link failure 
or capacity augmentation causing z* to be changed. In this case the system may be 
reset and the queue lengths will begin to adapt to the new operating point. Finally, 
there may be delays in obtaining and processing price information due to physical 
transmission delays and/or due to the need for reducing the overhead imposed by this 
feedback. However, arguing along the lines of [6], the effect of delay in the analysis
of the stochastic approximation scheme above can be shown merely to add an
additional $O(a)$ error. (Recall that the ‘error’ $|p^r(t) - p^r(t - \tau)|$ for a bounded random
delay $\tau$ is $O(aT)$, where $T$ is any bound on $\tau$.) Hence we ignore it. Simulations
validate this observation.


9.4.2 Alternative price adaptation equations 


The following alternative schemes can also be used. 


1. Linear Relative Deviation Scheme: This scheme is motivated by the argument
that the marginal price per unit deviation in the route congestion should be higher
for the route which is targeted to operate at lower congestion level. Thus we
replace $(z^r(t_{i+1}) - z^{r*})$ in the linear-absolute scheme of (9.4) by
$\left(\frac{z^r(t_{i+1}) - z^{r*}}{z^{r*}}\right)$. The adaptation equation under this scheme will be

$$p^r(t_{i+1}) = \Gamma\left(p^r(t_i) + a\, p^r(t_i) \left(\frac{z^r(t_{i+1}) - z^{r*}}{z^{r*}}\right)\right) \quad \forall r \in R. \qquad (9.9)$$

Fig. 9.1. A four-node, five-link network. Each queue is numbered to identify the routes. Even
numbered queues are the QoS queues and the odd numbered queues are non-QoS queues. 


2. Quadratic Deviation Scheme: The squares of the actual and nominal traffic are
used to calculate the deviation in the adaptation equation. This will amplify the
errors and hence provide higher marginal prices in the upward swings (which
are of more concern because they may cause packet drops) than in the downward
swings. The price adaptation equation will be of the form

$$p^r(t_{i+1}) = \Gamma\big(p^r(t_i) + a\, p^r(t_i)\, (z^r(t_{i+1})^2 - (z^{r*})^2)\big) \quad \forall r \in R. \qquad (9.10)$$


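3. Quadratic Relative Deviation Scheme: This is the quadratic version of the
linear-relative scheme.

$$p^r(t_{i+1}) = \Gamma\left(p^r(t_i) + a\, p^r(t_i) \left(\frac{z^r(t_{i+1})^2 - (z^{r*})^2}{(z^{r*})^2}\right)\right) \quad \forall r \in R. \qquad (9.11)$$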





Other variations are also possible. We could calculate the average deviations and 
the relative deviations over small moving time-windows or shifting time-windows. 
We could also mandate that a price change be committed only if the price change (rel- 
ative or absolute) exceeds a specified threshold. We could also discretize the prices. 
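
The deviation terms of the four schemes, together with the optional commit threshold mentioned above, can be compared side by side in a small sketch. Everything named here (`deviation`, `price_update`, the `threshold` argument and the numbers) is hypothetical; in particular, normalizing the quadratic relative deviation by $(z^{r*})^2$ is an assumption made by analogy with the linear-relative scheme, and the threshold is applied to the absolute price change.

```python
def deviation(z, z_star, scheme):
    """Deviation term fed to the price update under each scheme."""
    if scheme == "linear":
        return z - z_star
    if scheme == "linear_relative":
        return (z - z_star) / z_star
    if scheme == "quadratic":
        return z ** 2 - z_star ** 2
    if scheme == "quadratic_relative":
        # Assumed normalization by the squared nominal congestion.
        return (z ** 2 - z_star ** 2) / z_star ** 2
    raise ValueError(f"unknown scheme: {scheme}")

def price_update(p, z, z_star, a, scheme, eta=1e-3, big_m=1e3, threshold=0.0):
    """Projected update; the change is committed only if it is large enough."""
    candidate = min(max(p + a * p * deviation(z, z_star, scheme), eta), big_m)
    if abs(candidate - p) < threshold:
        return p  # change too small: keep the posted price
    return candidate

# Equal-sized swings around z* = 2 are treated asymmetrically by the
# quadratic scheme: +5 for the upward swing, -3 for the downward one.
up = deviation(3.0, 2.0, "quadratic")
down = deviation(1.0, 2.0, "quadratic")
```

The asymmetry between `up` and `down` is the effect described in the text: upward swings, which risk packet drops, move the price more than downward swings of the same size.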


9.5 Simulation Results 


We describe the results from our simulations of a four-node, five-link network
shown in Fig. 9.1. Note that the links are directional. Each link maintains two classes
of queues: class 0 for QoS traffic, where the queue length will be maintained at 2
packets for each route, and class 1, a non-QoS class with no price for service. Buffer
capacity of class 0 queues is 8 while that of class 1 queues is 20. The routes are
shown in Table 9.1. Propagation delays and the communication delays in conveying
the prices are assumed to be zero. Interarrival times are i.i.d. hyperexponential (an
equal mixture of exponentials of rate 1.5 and 0.5) and service times are constant. The
small user model is considered in which an arrival of class $\alpha$ at time $t$ chooses the
route for which $\alpha p^r(t) + (1 - \alpha) \sum_{(i,j) \in r} y_{i,j}(t)$ is minimum from among those that
it can take to reach its destination; $\alpha$ is uniformly distributed in (0,1). For example,
a packet from A to D has a choice of four routes while that from A to B has only two


Fig. 9.2. Long term congestion. (Panel label: time average of $z(t)$.)

Srce-Dstn | Routes
A → B     | [(1,0)] (queue 0); [(1,1)] (queue 1)
A → D     | [(1,0), (2,0), (3,0)] (queues 0, 2 & 4); [(1,1), (2,1), (3,1)] (queues 1, 3 & 5);
          | [(5,0), (3,0)] (queues 8 & 4); [(5,1), (3,1)] (queues 9 & 5)
B → A     | [(2,0), (3,0), (4,0)] (queues 2, 4 & 6); [(2,1), (3,1), (4,1)] (queues 3, 5 & 7)
D → A     | [(4,0)] (queue 6); [(4,1)] (queue 7)
A → C     | [(5,0)] (queue 8); [(5,1)] (queue 9)

Table 9.1. The route set for the simulation experiments. The sequence of queues for each
route is also given.


choices. The simulation is run for one million time units. As stated above, $z^{r*}_{i,j} = 2$
for all $i$, $j$, $r$, and $z^{r*} = z^{r*}_{i_1,j_1} + \cdots + z^{r*}_{i_{n_r},j_{n_r}}$.


The congestion data is collected at the arrival instants. The long term average of
the congestion at time ¢ is calculated by taking the sample average of the congestion 
seen by arrivals up to time t. Fig. 9.2 shows the long term congestion in the 12 routes 
in the network. Observe that in the routes with controlled queues, the average con- 
gestion approaches the prescribed value fairly quickly and maintains the prescribed 
average. 

The short-term average of the congestion seen by the arrivals is calculated by 
using a shifting-window and obtaining the sample average over the arrivals in the 
window. Fig. 9.3 shows these short term averages for the last 100,000 time units 
of the simulation. Note that it is reasonably close to the prescribed value. Fig. 9.3 
also shows the instantaneous congestion for the last 500 time units of the simulation. 
Observe that there could be some fluctuations in the instantaneous values while the 
averages are maintained at the prescribed value. 
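
The two averages used in these plots can be computed as in the following sketch. It assumes the congestion samples are given as a list in arrival order and, as a simplification, sizes the shifting window in number of arrivals rather than in time units; the function names are hypothetical.

```python
from collections import deque

def long_term_average(samples):
    """Sample average of the congestion seen by all arrivals so far."""
    averages, total = [], 0.0
    for count, z in enumerate(samples, start=1):
        total += z
        averages.append(total / count)
    return averages

def shifting_window_average(samples, window):
    """Sample average over the most recent `window` arrivals."""
    recent, total, averages = deque(), 0.0, []
    for z in samples:
        recent.append(z)
        total += z
        if len(recent) > window:
            total -= recent.popleft()  # drop the oldest sample
        averages.append(total / len(recent))
    return averages

long_run = long_term_average([2, 4, 6])
short_run = shifting_window_average([2, 4, 6, 8], window=2)
```

The long term average smooths out everything, while the windowed average follows recent behaviour, which is why the two are reported separately.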

We also show the instantaneous queue lengths of the ten link-queues in the net- 
work for the last 1000 time units of the simulation. Observe that the queues do satu- 
rate and packets get dropped in both the controlled and the uncontrolled queues. 

Recall from our discussion earlier that communication delays cause an O(aT) 
error where T is an upper bound on the delay. To verify this, in [7] extensive simula- 
tions were carried out for the single link case with communication delays in convey- 
ing both the price and the congestion information and it was found that the adaptation 
scheme works well. We do not expect this to change in the network case.


9.6 Implementation Issues 
9.6.1 Choosing the operating point 


We suggest three possible ways to choose the operating point. A simple scheme
is devised to guarantee a certain average delay on each route. Let $d^r$ be the desired
mean delay on route $r$. Partition $d^r$ into $d^r_{i_1,j_1}, \ldots, d^r_{i_{n_r},j_{n_r}}$ such that
$d^r_{i_1,j_1} + \cdots + d^r_{i_{n_r},j_{n_r}} = d^r$ and choose $z^{r*} = d^r$ and
$z^{r*}_{i_k,j_k} = d^r_{i_k,j_k}$, $k = 1, \ldots, n_r$.

Fig. 9.3. The left graphs show the average congestion on the route measured using a shifting
window of 100 units. The right graphs show the instantaneous congestion. 


A second scheme would be to choose $z^{r*}$ to satisfy an upper bound on the packet
drop probability. Define service grade $j$ on link $i$ by $\alpha_{i,j} > 0$, which is an upper
bound on the packet drop probability. This means $\nu(y_{i,j}(t) > b_{i,j}) < \alpha_{i,j}$,
where $b_{i,j}$ is the upper bound on $y_{i,j}(t)$ and $\nu(\cdot)$ is the probability measure. Using
the Chebyshev inequality, $\nu(y_{i,j}(t) > b_{i,j})$ may be bounded from above by $y^*_{i,j}/b_{i,j}$.
Therefore, $y^*_{i,j} \le b_{i,j}\alpha_{i,j}$ would ensure the desired level of QoS. Further, the operating
points $z^*$ for the routes can be decided by solving the following optimization
problem:

Fig. 9.4. Instantaneous total queue lengths in the ten link-queues in the network.


$$\max \; \min_{r : r \in R} z^{r*}$$

subject to

$$b_{i,j}\alpha_{i,j} \ge y^*_{i,j} = \sum_{r \in R_{i,j}} z^{r*}_{i,j}, \qquad j = 1, \ldots, J; \quad i = 1, 2, \ldots, N.$$


Here $R_{i,j}$ is the set of routes using service grade $j$ on link $i$. The idea behind the
scheme above is to accommodate as much traffic on the routes as we can, while still 
maintaining the prescribed level of QoS at each of the individual queues. 
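
The constraint in this problem is easy to check numerically for a candidate operating point. The sketch below is only an illustration under assumed data: the dictionary layout, the queue and route names, and the values of $b_{i,j}$ and $\alpha_{i,j}$ are hypothetical, and the code checks feasibility rather than solving the max-min problem.

```python
def max_mean_occupancy(b, alpha):
    """Largest y* for which the first-moment bound y*/b stays within alpha."""
    return b * alpha

def is_feasible(targets, routes_on_queue, b, alpha):
    """Check sum over r in R_ij of z^{r*}_ij <= b_ij * alpha_ij for every queue."""
    for queue, routes in routes_on_queue.items():
        load = sum(targets[(r, queue)] for r in routes)
        if load > max_mean_occupancy(b[queue], alpha[queue]):
            return False
    return True

# One queue (link 1, grade 0) shared by two routes; illustrative values only.
queue = (1, 0)
routes_on_queue = {queue: ["r1", "r2"]}
b = {queue: 8.0}
alpha = {queue: 0.5}

ok = is_feasible({("r1", queue): 2.0, ("r2", queue): 2.0}, routes_on_queue, b, alpha)
too_much = is_feasible({("r1", queue): 2.0, ("r2", queue): 3.0}, routes_on_queue, b, alpha)
```

Here the queue can carry a total mean occupancy of at most 4, so two routes targeting 2 each are admissible while targets of 2 and 3 are not.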

A third method to choose the network operating point $z^*$ is as follows. Let $c_{i,j}$
be the penalty per packet loss from the queue of service grade $j$ on link $i$. Then
the average penalty for this queue would be $c_{i,j}\,\nu(y_{i,j}(t) > b_{i,j})$, which is bounded
from above by $c_{i,j}\, y^*_{i,j}/b_{i,j}$, and which is further bounded from above by $c_{i,j}\alpha_{i,j}$.
Since the users are charged for their traffic on each route, the service provider would
ideally like the routes to operate at the point $z^* = [z^{1*}, \ldots, z^{K*}]$ that would maximize
the profit. Thus, the operating point $z^*$ can be decided by solving the following
problem that maximizes the minimum profit of the service provider:

$$\max \Big\{ \sum_{r} \eta\, z^{r*} - \sum_{i,j} \Big( \frac{c_{i,j}}{b_{i,j}} \sum_{r \in R_{i,j}} z^{r*}_{i,j} \Big) \Big\}$$

subject to

$$b_{i,j}\alpha_{i,j} \ge y^*_{i,j} = \sum_{r \in R_{i,j}} z^{r*}_{i,j}, \quad \text{for } j = 1, \ldots, J, \text{ and } i = 1, \ldots, N,$$

where $\eta$ is the minimum possible price that the service provider can charge.


9.6.2 Communicating prices and congestion information 


The instantaneous prices and the congestion parameters of the routes on each of 
the links may be exchanged using a link-state protocol like in OSPF. Simple modifi- 
cations to the standard OSPF protocol along the lines of the QOSPF protocol suffice. 
With the congestion information on all the links available at the ingress link, the 
ingress node can calculate the route congestion and hence choose the route on which 
the packet is to be forwarded. Note that our analysis allows for non-zero delays in 
communicating the congestion and pricing information and the effect is shown to be 
negligible. 


9.7 Discussion 


This chapter leaves plenty of room for further work in several directions. One 
can try out many other variations of the price adaptation scheme and develop heuris- 
tics that are in the same spirit as (9.4). See [7] for more suggestions. The operating 
point could also be selected using other methods, e.g., those based on the arrival 
rate of packets into the route. Of course, we could experiment with different arrival 
processes and service time distributions and user models. Some of these have been 
investigated but not presented here and we have found that the control objective is 
achieved. 

An important variation would be to develop a combined link-route pricing. Intu- 
itively, one would like to assign prices to the 2-tuple of service grade and link rather 
than to routes. The route price should be simply the sum (or a simple function) of 
the prices of the service grades on the links over which the route is defined. A naive 
possibility is to obtain route prices using the adaptation equation of (9.4) for some 
routes chosen so that if we assume the route prices are sums of link prices, one can
solve for the latter explicitly in terms of the former. In turn, the prices for the re- 
maining routes can be calculated by simply summing the prices of links that occur in 
the route. Our preliminary numerical experiments with such a scheme did not show 
encouraging results. Nevertheless it is worth exploring further. 

If the number of source-destination pairs is large, the schemes discussed in this 
chapter may not be practical because of the number of routes one needs to define. 
However, note that much of networking literature essentially considers routes, espe- 
cially while considering stream or circuit multiplexed traffic. Also, with the emer- 
gence of MPLS technology and the use of label switched paths which are essentially 
routes, the idea that we present in this chapter does not look so impractical and can 
be implemented at the ingress router of an MPLS domain. Further, we believe that 
the scheme that we present in this chapter can have applications to DiffServ-aware 
MPLS networks as described in [35]. 

Summary. Many distributed multiple access (MAC) protocols use an exponential backoff
mechanism. In that mechanism, a node picks a random backoff time uniformly in an interval
that doubles in size after a collision. When used in an Ad-Hoc network, this backoff
mechanism is unfair towards nodes in the middle of the network. Indeed, such nodes tend to
experience more collisions than nodes with fewer neighbors; consequently, they often choose
larger delays than those other nodes. We propose a different backoff mechanism that achieves
a fairer allocation of the available bandwidth by decreasing the backoff delay upon collision
or failure to send a packet. That is, a node becomes more aggressive after each failure.
Accordingly, we call the mechanism the Impatient Backoff Algorithm (IBA). The nodes maintain the
stability of the algorithm by resetting, in a distributed way, the average backoff delays when 
they become too small. We perform a Markov analysis of the system to prove stability and 
fairness in simple topologies. We also use simulations to study the performance of IBA in ran- 
dom Ad-Hoc networks and compare with an exponential backoff scheme. Results show that 
IBA achieves comparable mean throughput, while delivering significantly better fairness. 


Key words: Fairness, MAC, Ad-Hoc Networks. 


10.1 Introduction 


It has been observed that the widely used exponential backoff mechanism (e.g., 
IEEE 802.11) is unfair towards nodes in the middle of an Ad-Hoc network with mul- 
tiple interference domains (see [1] and [2]). This unfairness results from the higher 
degree of contention that these nodes face compared to nodes at the outer edges. We 
illustrate that unfairness and we propose a new backoff scheme to reduce it. 


10.1.1 Unfairness of exponential backoff 


To demonstrate the unfairness of the exponential backoff, we consider the network
with three links as shown in Fig. 10.1. Transmissions A1-A2 and B1-B2 both
interfere with X1-X2, while A1-A2 and B1-B2 do not interfere with each other. Link
X therefore faces more contention and is more likely to experience collisions than
its neighbors.

Fig. 10.1. Simple topology demonstrating the unfairness of 802.11.


We perform a simple experiment with three pairs of laptops located in three dif- 
ferent rooms, as shown in Fig. 10.1, to implement this topology. The laptops use 
802.11b with link rates of 11 Mbps. The table summarizes the achieved rates on
each link. Note that MAC overhead limits the maximum possible capacity of a link
to 6 Mbps. We observe that links A and B can simultaneously achieve the maximum
6 Mbps rate. When links A and X are on at once, they share the channel equally, each
receiving approximately 3 Mbps. On the other hand, if A, B and X are all on
simultaneously, link X’s throughput drops significantly. This simple experiment indicates
the unfairness of 802.11b towards nodes in the middle of the network. 

It is easy to see the cause of this unfairness. 802.11b follows a backoff mechanism 
whereby nodes try to capture the channel after waiting for a random backoff selected 
uniformly in an interval. Upon a collision, nodes are required to double the inter- 
val and try again (i.e., try less aggressively). Consider node X above. It faces more 
contention than nodes A and B, and thereby has a lesser chance of success. Conse- 
quently, it collides more often than nodes A and B, and backs off more. As a result, 
node X succeeds much less often — as demonstrated by the results in Fig. 10.1. 
More generally, this phenomenon biases the network against nodes in the middle of 
an Ad-Hoc network. This effect is particularly undesirable because multi-hop routes 
tend to use the middle of the network more often than the outer edges. 

The motivation for exponential backoff is stability: by backing off exponentially fast,
the rate of transmission attempts decreases quickly even if more nodes become ac- 
tive. Consequently, the likelihood that one node succeeds quickly approaches 1. Of 
course, the backoff delay increases somewhat but remains generally small compared 
to the packet transmission times. When all the nodes share a collision domain, the 
fairness issue does not arise. Thus, such a backoff scheme is suitable for the shared 
collision domain of the Aloha network, the original Ethernet, and typical 802.11 con- 
figurations. As we just discussed, the situation is quite different in Ad-Hoc networks. 


10 Achieving Fairness in a Distributed Ad-Hoc MAC 161 
10.1.2 Impatient backoff algorithm 


We propose a novel backoff mechanism, the Impatient Backoff Algorithm (IBA), 
that attempts to improve the fairness in a distributed MAC algorithm running across 
multiple interference domains. When using IBA, nodes decrease their average back- 
off delay upon collision — thereby becoming more aggressive in attempting to cap- 
ture the next slot. Also, nodes increase their average backoff delay upon successful 
transmission. The danger of the scheme becoming unstable because of frequent col- 
lisions is handled by resetting the average backoff delays when they get too small. 
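
The contrast between the two backoff philosophies can be made concrete with a small sketch. It is not the IBA specification: the multiplicative factors, the floor that triggers a reset, the reset value and the window bounds are all hypothetical placeholders, and the reset is modelled locally at one node, whereas the text describes a distributed reset. Only the directions of the adjustments (exponential backoff doubles its interval on collision; IBA shrinks its average delay on collision, grows it on success and resets when it becomes too small) are taken from the text.

```python
def exponential_backoff_window(window, collided, w_min=16, w_max=1024):
    """Exponential backoff: double the interval on collision, restart on success."""
    if collided:
        return min(2 * window, w_max)
    return w_min

def iba_mean_backoff(mean, collided, decrease=0.5, increase=2.0,
                     floor=1.0, reset_value=64.0):
    """IBA-style sketch: become more aggressive on failure, less on success.

    All parameter values are assumptions made for illustration.
    """
    mean = mean * decrease if collided else mean * increase
    if mean < floor:
        mean = reset_value  # average delay got too small: reset for stability
    return mean

# After a collision the two rules move in opposite directions.
exp_after_collision = exponential_backoff_window(16, collided=True)
iba_after_collision = iba_mean_backoff(8.0, collided=True)
```

A node that collides often therefore ends up with a shorter average delay under the IBA-style rule, rather than a longer one.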

We use a Markov chain model to show that IBA achieves fairness in simple 
topologies. We demonstrate the stability of the backoff scheme under reasonable as- 
sumptions by proving the positive recurrence of the Markov chain. We also evaluate 
the throughput tradeoff required in order to achieve fairness. 

Our analysis does not take into account collisions that happen due to propagation 
delays and imperfect knowledge of interference. We have built a simulation model 
that captures these effects to study the performance of IBA in an arbitrary topology 
and compare it against traditional 802.11-like exponential backoff mechanisms. Simulation
results demonstrate that IBA is able to maintain a level of mean throughput
comparable to exponential backoff in a random network — but achieves significantly 
better fairness. We use the simulation model to study the effect of a realistic reset 
mechanism which propagates imperfectly the reset control message hop by hop. We 
also evaluate the variations caused by the values of certain design parameters and 
comment on their choices. 

The chapter is organized as follows. We begin by describing related work in Sec- 
tion 10.2. In Section 10.3, we present the backoff model for IBA, which we evaluate 
analytically in Section 10.4. IBA design parameters and their effects are discussed 
in Section 10.5, while Section 10.6 presents simulation results before we end with 
sections on conclusions and future work. 


10.2 Related Work 


There is a wide body of literature dealing with 802.11 protocols (e.g. [3]), stan- 
dards being available at [4]. Our focus is on the backoff mechanism utilized to handle 
congestion in these networks — and their resulting throughput and fairness. 

Bianchi [5] presents a two-dimensional Markov model of the exponential backoff 
mechanism in 802.11. By assuming that the probability of collision of a node does 
not depend on its own state history, the author is able to derive expressions for the 
packet transmission probability and saturation throughput. Ergen and Varaiya extend 
that work in [6]. Their model incorporates carrier sense, non-saturated traffic and 
SNR, for both basic and RTS/CTS access mechanisms. Analysis of the model shows 
that the throughput first increases, and then decreases with the number of active 
stations. A valuable aspect of this work is the modeling of variable packet lengths 
and the un-slotted nature of the protocol (i.e., no synchronization between nodes). 
Varaiya and co-workers have also looked at related problems in [7], [8] and [9]. 

Slotted media access protocols have been studied for several decades (see [10] for
an overview). In particular, a detailed analysis of Slotted Aloha MAC has evaluated 
its throughput and fairness. More recently, Yuan and Marbach [11] have proposed a 
rate control for random access networks. By controlling the rate at which the nodes 
attempt to transmit, by increasing it when idle and decreasing it after collision, the 
system is shown to be stable, i.e., positive recurrent. 

Our model for IBA differs from [5], [6] and also [11] in a crucial aspect — we 
consider Ad-Hoc networks spanning multiple interference domains. Consequently, 
the nodes do not all share the same medium, and so have different degrees of con- 
tention, and therefore different collision probabilities. It is this fact that biases such 
protocols against middle nodes in the network. Addressing this fairness issue is the 
primary goal of this chapter, and leads us to propose the strategy of becoming more 
aggressive upon collision. 


10.3 Backoff Model 


10.3.1 Assumptions 


We make simplifying assumptions on the MAC model. The main assumption is 
that packet transmissions occur in a slotted and synchronized fashion. Each packet 
time slot is divided into two phases: 

1. Backoff Contention Phase. In this phase, each node that has a packet to send
generates a random backoff value and waits for that many backoff mini-slots. The
backoff mini-slots are much smaller than the packet transmission slot. If the node has
not heard a transmission from a neighboring node while waiting, it sends out
a short ‘Slot Capture Message’. All neighbors that hear this slot capture message
(i.e., those within transmission range) or that carrier-sense noise in the channel (i.e.,
those within interference range) keep quiet for this slot.

2. Packet Transmission Phase. At the end of the Backoff Contention Phase, all nodes 
which successfully sent out the slot capture messages will transmit a constant sized 
packet. Thus, only the nodes that have generated a backoff delay smaller than or 
equal to those of their neighbors get to transmit a packet. A successful transmission 
is confirmed by an acknowledgement (ack). 

An example is shown in Fig. 10.2, where five nodes in a line contend for the 
channel. Each node is assumed to interfere with neighbors two hops away. The lower 
part of the picture shows one slot. During the Backoff Contention Phase, node A 
chooses a smaller backoff than B and C — consequently it is able to send a Slot 
Capture Message that is heard by B and C, which keep quiet. Nodes D and E are not 
affected by A’s slot capture, and E wins that contention. As a result, nodes A and E 
utilize the Packet Transmission phase in parallel. Although C had a smaller backoff 
than E, C is quiet in this slot, allowing E to transmit. Note that nodes wait till the end 
of the entire Backoff Contention phase, before beginning the Packet Transmission 
phase. 
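
The contention rule in this example can be sketched in a few lines of Python. This is only an illustration under assumed inputs: the function name `slot_winners`, the backoff values and the neighbor map are hypothetical, chosen to reproduce the outcome described for Fig. 10.2.

```python
def slot_winners(backoffs, neighbors):
    """Return the nodes that send a Slot Capture Message in one slot.

    backoffs:  dict node -> backoff (in mini-slots) drawn for this slot
    neighbors: dict node -> set of nodes within interference range
    """
    captured = {}  # node -> backoff at which it captured the slot
    # Visit nodes in order of increasing backoff (time advances mini-slot by mini-slot).
    for node in sorted(backoffs, key=backoffs.get):
        heard_capture = any(
            nb in captured and captured[nb] < backoffs[node]
            for nb in neighbors[node]
        )
        if not heard_capture:
            captured[node] = backoffs[node]
    return sorted(captured)


# Five nodes in a line; each node interferes with nodes up to two hops away.
line = ["A", "B", "C", "D", "E"]
neighbors = {
    n: {line[j] for j in range(len(line)) if j != i and abs(i - j) <= 2}
    for i, n in enumerate(line)
}
# Hypothetical backoffs with A < C < E < B < D.
backoffs = {"A": 1, "B": 4, "C": 2, "D": 5, "E": 3}
winners = slot_winners(backoffs, neighbors)
```

With these values, A and E capture the slot while C stays quiet despite having a smaller backoff than E. Two neighbors that draw the same backoff both capture the slot in this sketch, which corresponds to a collision in the model.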

Collisions can occur in this scheme if two neighbors choose the same backoff. In
this case, neither will hear the other’s slot capture message. Hence both nodes will 
try to send packets, and collide, resulting in a wasted packet transmission slot. A 
collision will also occur if a node within interference range is unable to discern the 
slot capture. Note that this scheme does not employ an RTS/CTS mechanism, hence 
hidden terminals [12] will not be accounted for. For instance, the transmissions of B 
and E do collide at C. 

Fig. 10.2. Two phases of Impatient Backoff.


Finally, slotted transmission implicitly assumes synchronization between the 
nodes. We do not address the details in this chapter, other than to indicate that the 
level of synchronization required is no more than any standard slotted MAC protocol. 


10.3.2 Exponential random backoff rate 


Traditional backoff mechanisms choose a random backoff uniformly in $[0, B_L]$,
where $B_L$ is the backoff limit. The mean backoff in this case is $B_L/2$. Instead of
using a uniform random variable, IBA chooses the backoff using an exponential
random variable with mean $B_L/2$. The exact number of backoff mini-slots is then
determined by rounding the random variable and capping it at a maximum value,
since the exponential random variable is unbounded.
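
As a minimal sketch of this draw, the snippet below samples an exponential variable, rounds it, and caps it. The function name and the `max_slots` parameter are assumptions made for illustration; the chapter does not state the value of the cap.

```python
import random


def draw_backoff(backoff_limit, max_slots, rng=random):
    """Draw a backoff in mini-slots: exponential with mean backoff_limit / 2,
    rounded to an integer and capped at max_slots (an assumed parameter)."""
    mean = backoff_limit / 2.0
    sample = rng.expovariate(1.0 / mean)  # expovariate takes the rate, i.e. 1 / mean
    return min(int(round(sample)), max_slots)


rng = random.Random(0)  # fixed seed so the example is repeatable
samples = [draw_backoff(32, 64, rng) for _ in range(1000)]
average = sum(samples) / len(samples)  # close to 16, slightly lowered by the cap
```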

We utilize the exponential random variable since it offers some useful characteristics
that we use in the analysis of the scheme in Section 10.4. In particular, when $n$
nodes with mean backoffs $b_1, b_2, \ldots, b_n$ contend, node $i$ wins the contention when its
backoff $B_i < B_j \ \forall j \neq i$. The probability of this occurrence is calculated as follows:

$$P(\text{Node } i \text{ wins contention}) = \frac{1/b_i}{\sum_{j=1}^{n} (1/b_j)} \qquad (10.1)$$

In this calculation, we ignore the probability that the smallest backoff delay falls
in the same mini-slot as another one.
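
Equation (10.1) can be evaluated directly. The helper below is an illustrative sketch (the name `win_probabilities` is not from the chapter) that turns a list of mean backoffs into per-node winning probabilities.

```python
def win_probabilities(mean_backoffs):
    """Probability that each node draws the smallest exponential backoff,
    following equation (10.1): (1/b_i) / sum_j (1/b_j)."""
    rates = [1.0 / b for b in mean_backoffs]
    total = sum(rates)
    return [r / total for r in rates]


# Two nodes with mean backoff 16 and one, more aggressive, with mean backoff 8.
probs = win_probabilities([16, 16, 8])
```

The node with half the mean backoff wins twice as often as either of the others.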


10.3.3 Updating average backoff delays 


The key principle of IBA is that nodes that face more contention become more 
aggressive so that they can get their fair share of the channel. This is achieved by 
updating the backoff based on feedback received in the last slot. 

Assume that a node has a mean backoff delay b and has a packet to transmit. If the 
node fails to send in the current slot, either because of a collision or because it loses 
during the contention phase, it decreases its mean backoff delay by a multiplicative 
factor m > 1. On the other hand, if the node transmits successfully, it increases its 
mean backoff delay by the same factor. Note that decreasing the mean backoff delay 
makes a node more likely to win during the next contention phase. 





Upon failure: $b := b/m$
Upon success: $b := b \times m$
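
The update rule is small enough to state as code. In this sketch the function name is hypothetical, and the default `m = 1.2` is the value the chapter reports using in its simulations (Section 10.5.3).

```python
def update_mean_backoff(b, success, m=1.2):
    """Impatient update: become more aggressive (smaller mean backoff) after a
    failure, and less aggressive after a successful transmission."""
    if m <= 1:
        raise ValueError("the multiplicative factor m must exceed 1")
    return b * m if success else b / m


after_failure = update_mean_backoff(16.0, success=False, m=2.0)
after_success = update_mean_backoff(16.0, success=True, m=2.0)
```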











10.3.4 Reset 


The obvious problem of an aggressive backoff mechanism is one of collision 
meltdown. When there are a lot of contending nodes, all but one decrease their mean 
backoff delay. This increases the chance that several of them pick a small backoff 
delay in the same mini-slot and hence collide. In order to avoid this situation, we 
propose the IBA Reset mechanism. The idea is to increase the mean backoff delay
of every node by a constant factor $R_F$ whenever any node’s mean backoff delay falls
below a reset limit $R_L$. A reset does not alter the result of the contention phase, since
equation (10.1) is unchanged when every mean backoff is multiplied by $R_F$. In Section 10.5, we
discuss the choice of values for $R_L$ and $R_F$.
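
A sketch of an idealized, simultaneous reset follows. The function names are hypothetical; the defaults (reset limit 3.2, reset factor 10) are the values the chapter uses in its simulations. The second helper checks that the ratios in equation (10.1) are unaffected by the scaling.

```python
def reset_if_needed(mean_backoffs, reset_limit=3.2, reset_factor=10.0):
    """Scale every mean backoff by reset_factor once any of them drops below
    reset_limit; otherwise return the values unchanged."""
    if min(mean_backoffs) < reset_limit:
        return [b * reset_factor for b in mean_backoffs]
    return list(mean_backoffs)


def win_probability(mean_backoffs, i):
    """Equation (10.1) for node i."""
    return (1.0 / mean_backoffs[i]) / sum(1.0 / b for b in mean_backoffs)


before = [3.0, 16.0, 8.0]
after = reset_if_needed(before)  # 3.0 is below the limit, so all values scale by 10
```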

In reality, it is impossible to change the mean backoff delay of all nodes simul- 
taneously; any reset message needs to propagate through the network. In Section 
10.6.3, we simulate a reset propagation scheme, as also the loss of some reset mes- 
sages, and see that their effect on overall throughput and fairness is minimal. 


10.3.5 Fairness 


Qualitatively, we understand fairness to imply that two nodes in similar situations 
ought to get the same share of the bandwidth. For simple networks, it is easy to see 
the “fair” allocation, e.g. in Fig. 10.1, we would like each of the three transmitting 
nodes to get 1/3 of the channel bandwidth. In more complicated and/or random 
topologies, evaluating fairness is a trickier proposition. 

There are several metrics of fairness in use, e.g. Max-Min Fair [10], Proportional
Fair [13], Jain’s Fairness Index [14], etc. In this chapter, we utilize Jain’s index to
measure scheduling fairness, and to compare against other MAC schemes. We prefer
Jain’s index [14] since it measures the fairness in terms of an optimal desired
throughput. Let the measured throughput be $Z_1, \ldots, Z_n$ and the desired throughput
be $D_1, \ldots, D_n$. Define $x_i = Z_i / D_i \ \forall i = 1, \ldots, n$. Then the fairness $F$ is given by

$$F = \frac{\left(\sum_{i=1}^{n} x_i\right)^2}{n \sum_{i=1}^{n} x_i^2} \qquad (10.2)$$
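
Equation (10.2) is straightforward to compute. The sketch below uses a hypothetical function name; the sample data are the throughput and fair-share columns reported in Table 10.3 of this chapter.

```python
def jain_index(measured, desired):
    """Jain's fairness index of equation (10.2), with x_i = Z_i / D_i."""
    x = [z / d for z, d in zip(measured, desired)]
    return sum(x) ** 2 / (len(x) * sum(v * v for v in x))


# Throughput and fair share of the five-node clique in Table 10.3.
throughput = [0.2040, 0.2220, 0.1330, 0.1710, 0.2080]
fair_share = [0.2040, 0.2217, 0.1340, 0.1717, 0.2073]
index = jain_index(throughput, fair_share)  # rounds to 1

# One node taking the whole channel among four equal claimants gives 1/4.
worst_case = jain_index([1, 0, 0, 0], [1, 1, 1, 1])
```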


10.4 Analysis 


We first evaluate IBA in some simple topologies that are amenable to analysis. 
We especially consider the star topology shown in Fig. 10.3(a). We also analyze the 
triangle clique topology in which three nodes all interfere with each other. 


10.4.1 Star topology 


The star topology is characterized by a middle node X that interferes with every 
other node, while the outer nodes do not interfere with each other. The figure shows 
a star with 4 arms, although we analyze a star with n outer nodes. Recall that we are 
modeling a non-RTS/CTS scheme. So we are concerned only with the interference 
between nodes as they transmit and need not consider the location of the receivers. 

The star topology is of special interest in terms of fairness, since node X contends 
with every other node in the network and traditional backoff mechanisms are unfair 
towards it. We want to ensure that our scheme is fair for this simple topology. 

When node X wins the backoff contention, it is the only one that transmits. How- 
ever, when any other node wins the contention, node X has to keep quiet and so the 
remaining outer nodes believe they won the round and they all transmit. Thus, as- 
suming that the nodes always have a packet to send, the mean backoff delays of all 
the outer nodes remain equal. 

Let the mean backoff of the middle node be $b_1$ and that of each of the $n$ outer
nodes be $b_2$. Since we use exponential random variables to generate the backoffs, the
probability that the middle node wins the contention, as in equation (10.1), is given
by

$$t_X = \frac{1/b_1}{1/b_1 + n/b_2} \qquad (10.3)$$

To simplify notation, we define $r_i = 1/b_i$ to be the rate of the exponential random
variable generating the backoff for node $i$. With this notation, equation (10.3)
becomes

$$t_X = \frac{r_1}{r_1 + n r_2} \qquad (10.4)$$

Note that for the purpose of analysis, we assume the backoff to be a real number,
and not rounded off to an integer as described in Section 10.3.2. This prevents
collisions from ever occurring, since two neighbors have zero probability of choosing the
exact same backoff. In Section 10.4.3, we analyze the star topology with collisions.

It is also worthwhile to consider the throughput-fairness tradeoff in the star topology.
The maximum throughput is achieved when the middle node is quiet and the
outer nodes are always on; this leads to a total throughput of $n$. On the other hand, a
fair share allows the middle node to be active 1/2 the time, while the other nodes are
active during the remainder. Consequently, the throughput is only
$\frac{1}{2} + \frac{n}{2} = \frac{n+1}{2}$.





10.4.2 Fairness in star topology 


Assume that all the outer nodes start with the same initial mean backoff delay.
Then, the star topology can be evaluated by an appropriate Markov chain whose
states capture the ratio between the mean backoff delays (i.e., between the backoff
rates). Let state $S_k$ designate the state when $r_1/r_2 = b_2/b_1 = n (m^2)^k = n m^{2k}$, as
shown in Fig. 10.3(b). Note that the ratio of backoff rates is a Markov chain since
the probability of success or failure, and therefore of moving to another state, is
completely determined by the ratio of the rates $r_1$ and $r_2$, independently of the past.


Following equations (10.3) and (10.4), the probability that the middle node X
in state $S_k$ succeeds in the next slot is given by
$\frac{n m^{2k}}{n m^{2k} + n} = \frac{m^{2k}}{m^{2k} + 1}$. If X wins, its
backoff rate is updated to $r_1 := r_1/m$, while all the other nodes update their backoff
rates to $r_2 := r_2 \times m$. As a result, the ratio between the rates becomes $n m^{2k-2}$, i.e.,
the chain moves to state $S_{k-1}$. Similarly, from state $S_k$, the probability that X loses
in the next slot is $\frac{n}{n m^{2k} + n} = \frac{1}{m^{2k} + 1}$, and this moves the state to $S_{k+1}$.


[Figure 10.3: panel (a) shows the interference links of the star; panel (b) shows
state $S_k \Rightarrow r_1/r_2 = b_2/b_1 = n (m^2)^k$.]

Fig. 10.3. (a) Star topology (b) $k$-th state in star topology M.C.


The Markov chain has the structure of a birth-death chain, as shown in Fig. 10.4,
that is symmetric about state $S_0$. This chain is seen to be positive recurrent for all
$m > 1$ because the average drift of the absolute value of the state is negative whenever
the state is not 0. We evaluate the steady state probabilities $\pi_i, i \in \mathbb{Z}$, of the
various states. We begin by calculating $\pi_1$ in terms of $\pi_0$.

Fig. 10.4. Fairness of star topology.





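$$\pi_0 = \pi_{-1} \cdot \frac{1}{m^{-2} + 1} + \pi_1 \cdot \frac{m^2}{m^2 + 1}$$
$$\Rightarrow \pi_0 = 2 \pi_1 \frac{m^2}{m^2 + 1}, \text{ as } \pi_{-1} = \pi_1 \text{ by symmetry}$$
$$\Rightarrow \pi_1 = \frac{m^2 + 1}{2 m^2} \pi_0$$


We can then use this value of $\pi_1$ to determine $\pi_2$.


$$\pi_1 \left( \frac{1}{m^2 + 1} + \frac{m^2}{m^2 + 1} \right) = \frac{1}{2} \pi_0 + \frac{m^{2 \times 2}}{m^{2 \times 2} + 1} \pi_2$$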

By repeating the process, we evaluate the steady state probability of state $S_k$
(for $k \geq 1$, with $\pi_{-k} = \pi_k$ by symmetry) as

$$\pi_k = \frac{1}{2} \times \left(1 + \frac{1}{m^{2k}}\right) \times \frac{1}{m^{k(k-1)}} \times \pi_0. \qquad (10.5)$$





Finally, we can evaluate the actual probabilities: 
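
$$\sum_{k=-\infty}^{\infty} \pi_k = \pi_0 + 2 \sum_{k=1}^{\infty} \pi_k = 1 \qquad (10.6)$$
$$\Rightarrow \pi_0 + 2 \pi_0 \sum_{k=1}^{\infty} \frac{1}{2} \left(1 + \frac{1}{m^{2k}}\right) \frac{1}{m^{k(k-1)}} = 1 \qquad (10.7)$$
$$\Rightarrow \pi_0 \left[ 1 + \sum_{k=1}^{\infty} \left(1 + \frac{1}{m^{2k}}\right) \frac{1}{m^{k(k-1)}} \right] = 1 \qquad (10.8)$$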





We can evaluate $\pi_0$ for various values of $m$.

$m$     | 1.05  | 1.2   | 1.5   | 2     | 4     | 10
$\pi_0$ | 0.123 | 0.230 | 0.325 | 0.395 | 0.471 | 0.495
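
The tabulated values can be checked numerically from equation (10.8). The sketch below sums the series until its terms become negligible; the function name and the tolerance are illustrative choices, not part of the chapter.

```python
def pi_zero(m, tol=1e-18):
    """Steady-state probability of state S_0 from equation (10.8):
    pi_0 = 1 / (1 + sum_{k>=1} (1 + m^(-2k)) * m^(-k(k-1)))."""
    total = 1.0
    k = 1
    while True:
        term = (1.0 + m ** (-2 * k)) * m ** (-k * (k - 1))
        if term < tol:  # the terms decay super-exponentially in k
            break
        total += term
        k += 1
    return 1.0 / total


values = {m: round(pi_zero(m), 3) for m in (1.2, 2, 10)}
```

For larger $m$ the chain spends almost half of its time in $S_0$, with the remainder concentrated on the two adjacent states.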
































We can also prove the following theorem about the expected transmission probability
of nodes in the star topology.


Theorem 1. Each node in the star topology has an expected transmission probability
of $\frac{1}{2}$.


Proof. The transmission probability of X at state $S_k$ is given by
$t_k^X = \frac{m^{2k}}{m^{2k} + 1}$ (Fig. 10.3(b)). Then, the expected transmission
probability of node X, $E[t^X]$, may be evaluated as

$$E[t^X] = \sum_{k=-\infty}^{\infty} \pi_k t_k^X = \pi_0 t_0^X + \sum_{k=1}^{\infty} \pi_k \left( t_k^X + t_{-k}^X \right) \qquad (10.9)$$

However,

$$t_k^X + t_{-k}^X = \frac{m^{2k}}{m^{2k} + 1} + \frac{m^{-2k}}{m^{-2k} + 1} = 1. \qquad (10.10)$$

Hence,

$$E[t^X] = \frac{\pi_0}{2} + \sum_{k=1}^{\infty} \pi_k = \frac{1}{2} \qquad (10.11)$$

(from equation 10.6). Also, the expected probability of successful transmission for
the outer nodes is $1 - \frac{1}{2} = \frac{1}{2}$.
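
Theorem 1 can also be checked numerically by weighting the per-state transmission probability with the steady-state distribution of equation (10.5). This is a hedged sketch: the function name and truncation tolerance are assumptions, and the result is only a floating-point confirmation of the proof above.

```python
def expected_tx_probability(m, tol=1e-18):
    """E[t^X] = sum_k pi_k * t_k for the collision-free star-topology chain."""
    def t(k):
        # t_k = m^(2k) / (m^(2k) + 1), rewritten to stay finite for large |k|
        return 1.0 / (1.0 + m ** (-2 * k))

    weights = {0: 1.0}  # unnormalised pi_k / pi_0, symmetric in k
    k = 1
    while True:
        w = 0.5 * (1.0 + m ** (-2 * k)) * m ** (-k * (k - 1))
        if w < tol:
            break
        weights[k] = weights[-k] = w
        k += 1
    norm = sum(weights.values())
    return sum(w * t(state) for state, w in weights.items()) / norm


results = [expected_tx_probability(m) for m in (1.05, 1.2, 2, 10)]
```

Every entry of `results` equals 1/2 up to rounding error, independently of $m$.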














The Markov chain model analyzed above assumes that the initial ratio between
$r_1$ and $n r_2$ is a power of $m^2$. If the initial ratio $r'$ lies between two powers of $m^2$,
then the resultant states will be $r' \times m^{2k}, k = -\infty, \ldots, \infty$. This positive recurrent
chain will also drift towards states where $r' \times m^{2k} \approx 1$.

In addition to each node having an average transmission rate of 1/2, this Markov
chain has a strong drift toward $S_0$, which suggests that the short-term behavior is
fair.


10.4.3 Probabilistic model of star topology with collisions


The model above is not affected by collisions in the system. A collision in the 
star topology causes all the mean backoff delays to be divided by m. As a result, 
the state does not change and the probability of success for any node, as given in 
equations (10.3) and (10.4), is unchanged. 


Calculating the probability of collision: We take into consideration the probability
of collision in the star topology and analyze the resulting chain of events. We
model here the probability of collision due to rounding off the chosen backoff delay.
Consider some outer node Z. A collision occurs with node X if the two backoff
delays $B_X$ and $B_Z$ are within a fixed time $\Delta$ of each other, corresponding to the
duration of a backoff mini-slot.

If $B_Z > B_X + \frac{\Delta}{2}$, then Z will hear X’s slot capture message and keep quiet. The
reverse occurs if $B_Z < B_X - \frac{\Delta}{2}$. So we only need to consider the probability that
$B_Z \in [B_X - \frac{\Delta}{2}, B_X + \frac{\Delta}{2}]$ given that $B_Z \geq B_X - \frac{\Delta}{2}$.


$$P(\text{X collides with Z at time } t)$$
$$= P(B_X \in [t, t + dt]) \times P\left(B_Z \in \left[t - \tfrac{\Delta}{2}, t + \tfrac{\Delta}{2}\right] \,\middle|\, B_Z \geq t - \tfrac{\Delta}{2}\right)$$
$$= P(B_X \in [t, t + dt]) \times P(B_Z \in [0, \Delta]), \text{ since } B_Z \text{ is memoryless.}$$


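We know that the pdf of the exponential random variable $B_X$, with backoff rate
$r_1 = 1/b_1$, is given by $b_X(t) = r_1 e^{-r_1 t}$. Similarly $b_Z(t) = r_2 e^{-r_2 t}$. Therefore,

$$p_{XZ} = P(\text{X collides with Z}) \qquad (10.12)$$
$$= \int_0^{\infty} r_1 e^{-r_1 t} \left[ \int_0^{\Delta} r_2 e^{-r_2 u} \, du \right] dt \qquad (10.13)$$
$$= \left(1 - e^{-r_2 \Delta}\right) \int_0^{\infty} r_1 e^{-r_1 t} \, dt = 1 - e^{-r_2 \Delta}. \qquad (10.14)$$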


For a particular outer node Z, the collision probability $p_Z = p_{XZ}$ since X is the only
node that it may collide with.

The probability that X collides at all can now be evaluated in terms of the probability
of collision with each of the $n$ neighbors. This is the probability that any one of the neighbors
picks a backoff delay in $[B_X - \frac{\Delta}{2}, B_X + \frac{\Delta}{2}]$, given that its backoff delay is $\geq B_X - \frac{\Delta}{2}$.


$$p_X = P(\text{X collides}) = 1 - P(\text{X does not collide with anyone}) \qquad (10.15)$$
$$= 1 - (1 - p_{XZ})^n = 1 - \left[1 - \left(1 - e^{-r_2 \Delta}\right)\right]^n \qquad (10.16)$$
$$= 1 - e^{-n r_2 \Delta}. \qquad (10.17)$$
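
Equations (10.14) and (10.17) are easy to evaluate. The sketch below uses a hypothetical function name and plugs in an initial mean backoff of 16 with a mini-slot width of 1 and 10 outer nodes, which are the parameter values the chapter uses for its numerical evaluation.

```python
import math


def collision_probabilities(r2, n, delta=1.0):
    """Return (p_XZ, p_X): the probability that X collides with one given outer
    node (eq. 10.14) and with at least one of the n outer nodes (eqs. 10.15-10.17)."""
    p_xz = 1.0 - math.exp(-r2 * delta)
    p_x = 1.0 - (1.0 - p_xz) ** n
    return p_xz, p_x


# Outer nodes with mean backoff 16 (rate 1/16), ten arms, mini-slot width 1.
p_xz, p_x = collision_probabilities(r2=1.0 / 16.0, n=10, delta=1.0)
```

The gap between the two values shows why the middle node, which contends with every outer node, sees far more collisions than any single outer node.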

Modeling event probabilities in star topology: We can use the values of $p_X$
and $p_Z$ to model the system. Recall that the backoff rates of all the outer nodes move
together, so the transmission and collision probabilities of all the outer nodes remain
equal, assuming they are the same initially.

Again, let the backoff rate of the middle node be $r_1$ and the backoff rate of each of the
$n$ outer nodes be $r_2$. Then, X attempts to transmit with probability $t_X = \frac{r_1}{r_1 + n r_2}$,
from equation (10.4), and an outer node transmits with probability $t_Z = \frac{n r_2}{r_1 + n r_2}$.
Given $t_X$ and $p_X$, we can calculate X’s probability of successful transmission
$s_X = t_X (1 - p_X)$. Similarly, we can also calculate the probability of success $s_Z$ for the
outer nodes.








Numerically evaluating the steady state of the system: We evaluate the steady 
state of the system for values of n = 2, 4, 10, 20, 50. In each case, the star topology 
achieves a fair sharing of the bandwidth. The middle node gets the same throughput 
as the outer nodes. 

Even if we bias the initial rates (e.g. by starting the middle node with an initial
mean backoff of 1024 instead of 16), the fairness of the Impatient Backoff Algorithm
is maintained. This situation with 10 outer nodes is plotted in Fig. 10.5. While
$s_X \ll s_Z$ initially, it converges quickly (in less than 40 time slots) to $s_Z$. The
two success probabilities then track each other. For the 500 time slots shown in the
figure, the average success probabilities are $s_X = 0.4748$ and $s_Z = 0.4750$.

If there is no initial bias, the two success probabilities converge even sooner. 
The periodic jumps in the values of sx and sz (seen in Fig. 10.5) are caused by 
the reset phenomenon — when the backoff delays reset, the collision probability 
falls abruptly, thereby enhancing the success probabilities for the next slot. Without 
a reset, the two values sx and sz converge asymptotically and stay together. 


Fig. 10.5. Modeling the star topology (with initial bias against middle node).

The collision probabilities depend on the choice of several parameters. We assume
an initial mean backoff of 16. We set $\Delta = 1$, to model the case where two
nodes collide when their backoff delays are rounded to the same integer. We also
incorporate the reset mechanism outlined in Section 10.3.4. We set the reset limit
$R_L = 16/5$ and the reset factor $R_F = 10$.


Fairness in star topology with changing number of neighbors: So far we have 
assumed that the nodes always have packets to transmit. Now we study the more 
realistic situation when the nodes only want to transmit some of the time. Every 
1000 time slots we randomly pick a subset of the 10 outer nodes to go to sleep, while
the middle node always wants to transmit. Thus the relative backoff delays go awry
every 1000th time slot, and the system relies on the IBA mechanism to get back in sync.

The results are plotted in Fig. 10.6. The blue line plots the success probability 
sx for the middle node while the red line plots the success probability sz for the 
outer nodes (the two lines are too close to identify separately). The number of active 
nodes n during each block of slots is also plotted using the dotted line. 

As seen in the figure, $s_X$ and $s_Z$ track each other closely and neither is able
to overwhelm the other, even with a changing number of outer nodes. The average
success probabilities over the entire time duration are $s_X = 0.4722$ and $s_Z = 0.4719$,
thus yielding a fairness index of 1.

Fig. 10.6. Modeling the star topology with a changing number of neighbors.


10.4.4 Triangle topology model 


The other topology that we analyze is the triangle topology, where each of three
nodes interferes with the others. Clearly, the situation is symmetric, but we need to
ensure that starting at any set of backoff rates, the system drifts back towards a state
where all nodes have an equal probability of successful transmission.

The evolution of the system is modeled by a Markov chain that represents the
backoff rates of the three nodes. A state is denoted by the triple $(1, b, c)$ that specifies
the backoff rates of the three nodes in increasing order. We express each of the rates
as a multiple of the smallest rate, so the first term is always 1. Thus the probability of
the first node winning the contention is $\frac{1}{1+b+c}$, while the other two nodes win with
probability $\frac{b}{1+b+c}$ and $\frac{c}{1+b+c}$, respectively. Every transmission results in one backoff
rate being divided by $m$, and the other two rates being multiplied by $m$. As a result,
the ratio of rates between any two nodes changes by a factor of $m^2$; consequently
values $b$ and $c$ in the Markov chain are powers of $m^2$. In Fig. 10.7, we represent
the Markov chain resulting from a triangle topology. In the figure, we use a value of
$m = 2$ as an illustration.























Fig. 10.7. Markov chain of triangle topology. 


The Markov chain is clearly irreducible, but it is periodic with a period of 3. 
In order to conclude that the system is fair, we need to show the Markov chain is 
positive recurrent. 


Theorem 2. The Markov chain that models the backoff rates in the triangle topology 
is positive recurrent. 


Proof. Let $A$ be a finite set of the states (to be defined below), including $(1, 1, 1)$.
Define the function $f$ as

$$f(1, b, c) = \log_{m^2} b + \log_{m^2} c. \qquad (10.18)$$

Given state $S$, we further define the Lyapunov drift function

$$\gamma(S) = -f(S) + \left( \sum_{T \in N(S)} P_{ST} \, f(T) \right). \qquad (10.19)$$

Here $N(S)$ are the states neighboring state $S$, and $P_{ST}$ is the transition probability
from state $S$ to state $T$. By Pakes’ Lemma [15], if there is some $\epsilon$ such that
$\gamma(S) < -\epsilon < 0, \ \forall S \notin A$, then the chain is positive recurrent. Let $0 < \epsilon < 1$. By observing
the structure of the Markov chain, we notice that there are three varieties of states as
illustrated in Fig. 10.8.





Fig. 10.8. Evaluating Lyapunov function on the triangle topology M.C. 


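Case 1 (Middle State): In this case, there are three possible neighboring states:
$(1, bm^2, cm^2)$, $(1, \frac{b}{m^2}, c)$ and $(1, b, \frac{c}{m^2})$, with transition probabilities as shown in
Fig. 10.8(a). Then,

$$\gamma(S) < -\epsilon$$
$$\Rightarrow -\left(\log_{m^2} b + \log_{m^2} c\right) + \frac{1}{1+b+c} \left(\log_{m^2}(bm^2) + \log_{m^2}(cm^2)\right)$$
$$\quad + \frac{b}{1+b+c} \left(\log_{m^2} \frac{b}{m^2} + \log_{m^2} c\right) + \frac{c}{1+b+c} \left(\log_{m^2} b + \log_{m^2} \frac{c}{m^2}\right) < -\epsilon$$
$$\Rightarrow -\left(\log_{m^2} b + \log_{m^2} c\right) + \frac{1}{1+b+c} \left(1 + \log_{m^2} b + 1 + \log_{m^2} c\right)$$
$$\quad + \frac{b}{1+b+c} \left(\log_{m^2} b - 1 + \log_{m^2} c\right) + \frac{c}{1+b+c} \left(\log_{m^2} b + \log_{m^2} c - 1\right) < -\epsilon$$
$$\Rightarrow \frac{1}{1+b+c} (2 - b - c) < -\epsilon$$
$$\Rightarrow \frac{2 + \epsilon}{1 - \epsilon} < b + c.$$

We can verify that this inequality holds when

$$b + c > 3. \qquad (10.20)$$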

Case 2 (Side State): The side state is characterized by $(1, 1, c)$, and has two
possible neighbors: $(1, m^2, cm^2)$ and $(1, 1, \frac{c}{m^2})$. From Fig. 10.8(b),

$$\gamma(S) = -\left(\log_{m^2} c\right) + \frac{2}{2+c} \left(1 + 1 + \log_{m^2} c\right) + \frac{c}{2+c} \left(\log_{m^2} c - 1\right) < -\epsilon$$
$$\Rightarrow \frac{1}{2+c} (4 - c) < -\epsilon$$
$$\Rightarrow \frac{4 + 2\epsilon}{1 - \epsilon} < c.$$

This is true when

$$c > 5. \qquad (10.21)$$


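Case 3 (Bottom State): The bottom state is characterized by $(1, b, b)$, and has
two possible neighbors: $(1, bm^2, bm^2)$ and $(1, \frac{b}{m^2}, b)$. From Fig. 10.8(c),

$$\gamma(S) = -\left(\log_{m^2} b + \log_{m^2} b\right) + \frac{1}{2b+1} \left(1 + \log_{m^2} b + 1 + \log_{m^2} b\right)$$
$$\quad + \frac{2b}{2b+1} \left(\log_{m^2} b - 1 + \log_{m^2} b\right) < -\epsilon$$
$$\Rightarrow \frac{1}{2b+1} (2 - 2b) < -\epsilon$$
$$\Rightarrow \frac{2 + \epsilon}{2(1 - \epsilon)} < b.$$

This is true when

$$b > 2. \qquad (10.22)$$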


The state variables $b$ and $c$ may be expressed as $m^{2k}$, $k \in \mathbb{Z}^+$. Depending on the
value of $m$, we can choose $A$ s.t. $(1, b, c) \in A \iff b = m^{2k_1} \leq 2$ and $c = m^{2k_2} \leq 5$.
Then, $\gamma(S) < -\epsilon < 0 \ \forall S \notin A$, since we satisfy equations (10.20)-(10.22). By
Pakes’ Lemma, the chain is positive recurrent.
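
The three drift expressions in this proof can be cross-checked numerically. The sketch below (all names are illustrative) computes the drift of equation (10.19) directly from the update rule: the winner's rate is divided by $m$, the other two are multiplied by $m$, and the state is renormalized so that the smallest rate is 1.

```python
import math


def f(state, m):
    """Lyapunov function of equation (10.18) for a state (1, b, c)."""
    _, b, c = state
    return math.log(b, m * m) + math.log(c, m * m)


def next_state(rates, winner, m):
    """Apply one successful transmission by `winner` and renormalise."""
    new = [r / m if i == winner else r * m for i, r in enumerate(rates)]
    smallest = min(new)
    return tuple(sorted(r / smallest for r in new))


def drift(b, c, m):
    """Expected change of f in one step from state (1, b, c), eq. (10.19)."""
    rates = (1.0, float(b), float(c))
    total = sum(rates)
    expected = sum(
        rates[i] / total * f(next_state(rates, i, m), m) for i in range(3)
    )
    return expected - f(rates, m)


m = 2
middle = drift(4, 16, m)   # Case 1: (2 - b - c) / (1 + b + c)
side = drift(1, 16, m)     # Case 2: (4 - c) / (2 + c)
bottom = drift(4, 4, m)    # Case 3: (2 - 2b) / (2b + 1)
```

For these sample states the computed drifts match the closed forms derived in the three cases and are negative, as the proof requires outside the finite set $A$.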














This analysis allows us to conclude that no matter what the initial backoff delays 
are, the three nodes will drift towards states where their backoff delays are equal. 
This observation is also confirmed by simulation results in Section 10.6.4. 


10.5 Choosing Algorithm Parameters 


IBA is characterized by three design parameters $m$, $R_L$ and $R_F$ whose choices
affect the performance of the system. In this section, we present qualitative arguments
for our choices.


10.5.1 Reset limit $R_L$


The reset limit $R_L$ is the smallest value to which we allow the mean backoff
to fall before it is reset. Having a very small value for $R_L$ allows nodes to maintain
low mean backoff delays. Since the actual number of backoff slots is rounded to the
nearest integer, multiple neighbors with low mean backoff delays are more likely to
choose the same backoff, leading to frequent collisions.

Choosing a large value for $R_L$ will alleviate this problem but leads to large backoff
values. In our current model this has no ill-effect since all nodes always wait for
the completion of the backoff contention phase before attempting to transmit. For
simulation purposes, we choose $R_L = \frac{16}{5} = 3.2$, which is one fifth of the initial mean
backoff value.


10.5.2 Reset factor $R_F$


Consider a dense subgraph in the Ad-Hoc network, e.g. a clique topology with
$q$ nodes. In such a situation, at each slot, at most one node succeeds while all the
others fail. Consequently, $q - 1$ of the nodes divide their mean backoff delay by $m$.
For a large enough $q$, at least one of the nodes repeatedly decreases its mean backoff
delay until it crosses $R_L$, causing a reset. Quantitatively, resets occur approximately
every $\log_m R_F$ slots, provided $\log_m R_F < q$.

Choosing a large $R_F$ indeed decreases the reset frequency, but this decrease happens
only on a slower logarithmic scale. In our simulations we choose $R_F = 10$.


10.5.3 Multiplicative factor m 


In Section 10.4.3 and 10.4.4, we showed that IBA is stable and fair in the star 
and triangle topologies for m > 1. In Section 10.6.2, simulations show this to be 
true even for random topologies. However, we need to select an optimal value for m. 

A small value of $m$ allows nodes to maintain a lower mean backoff delay on
average, since the mean backoff may be closer to $R_L$ without hitting it. This leads to
more collisions, following the same argument as Section 10.5.1. On the other hand,
a large value of $m$ causes frequent resets. In a dense network, a reset occurs
approximately every $\log_m R_F$ slots.

The compromise is to choose as large a value of $m$ as possible (to minimize
collisions), yet choose it small enough to keep resets under control. For a practical
system, this is entirely a design decision based on the efficiency of the reset mechanism.
We chose $m = 1.2$ for our simulations. In this case, the reset frequency is
bounded above by the rate of one reset every $\log_{1.2} 10 \approx 12.6$ slots.
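
The logarithmic dependence on both parameters is easy to tabulate. This short sketch (the function name is an assumption) evaluates the approximate number of slots between resets in a dense network for a few candidate values.

```python
import math


def slots_between_resets(m, reset_factor):
    """Approximate number of slots between resets in a dense network: log_m(R_F)."""
    return math.log(reset_factor) / math.log(m)


chosen = slots_between_resets(1.2, 10)  # about 12.6, the value used in the chapter
table = {
    (m, rf): round(slots_between_resets(m, rf), 1)
    for m in (1.05, 1.2, 2.0)
    for rf in (10, 100)
}
```

Raising the reset factor from 10 to 100 only doubles the interval between resets, whereas changing $m$ from 2 to 1.2 lengthens it by almost a factor of four.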


10.6 Simulation Results 


We have built a simulation model using Matlab [16] that allows us to compare 
the performance of the Impatient Backoff Algorithm (Section 10.3) with a slotted 
exponential backoff algorithm (EBA). EBA chooses a backoff uniformly in a given
range, starting with [0,32]. The node with the smallest backoff in its neighborhood 
is able to transmit in that slot. A collision doubles the range, while a successful 
transmission brings it back to the initial range. No changes are made to the backoff 
when the node is quiet. We evaluate over a wide range of topologies, as we explain 
below. 
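
For comparison with the IBA update, the EBA baseline described here can be sketched as follows. The names and the string-valued `outcome` argument are hypothetical, and drawing an integer number of mini-slots is an assumption; the chapter only states that the backoff is uniform over the current range.

```python
import random

INITIAL_RANGE = 32  # EBA starts with the range [0, 32]


def eba_draw(window, rng=random):
    """Draw a backoff uniformly from [0, window] (integer mini-slots assumed)."""
    return rng.randint(0, window)


def eba_update(window, outcome):
    """Update the EBA range after a slot.

    outcome: "collision" doubles the range, "success" restores the initial
    range, and "quiet" (the node deferred) leaves it unchanged.
    """
    if outcome == "collision":
        return window * 2
    if outcome == "success":
        return INITIAL_RANGE
    return window


window = INITIAL_RANGE
for outcome in ("collision", "collision", "quiet", "success"):
    window = eba_update(window, outcome)
```

Note the contrast with IBA: EBA backs off after a collision, whereas IBA becomes more aggressive after any failure.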


10.6.1 Simple topologies 


We first evaluate our scheme by simulating simple topologies. The results are 
summarized in Table 10.1. For each topology, we list the mean throughput and the
fairness, for both EBA and IBA. As explained earlier, we use Jain’s Fairness Index 
(Section 10.3.5) to compare the two schemes, assuming an optimally fair allocation 
lets each node get an equal share of the throughput. 














                    | Exponential Backoff (EBA)  | Impatient Backoff (IBA)
Topology            | Mean Throughput | Fairness | Mean Throughput | Fairness
Star with 2 arms    | 0.5498          | 0.91     | 0.4878          | 1
Star with 4 arms    | 0.5537          | 0.89     | 0.4866          | 1
Star with 10 arms   | 0.5519          | 0.90     | 0.4880          | 1
3 node clique       | 0.3201          | 1        | 0.3094          | 1
5 node clique       | 0.1871          | 1        | 0.1745          | 1
10 node clique      | 0.0890          | 1        | 0.0777          | 1
Square              | 0.4792          | 1        | 0.4773          | 1
Pentagon            | 0.3863          | 1        | 0.3832          | 1
Hexagon             | 0.4267          | 1        | 0.4671          | 1

Table 10.1. Simulation results for simple topologies


In the star topologies, EBA is unfair, while IBA has a fairness of 1. However,
IBA pays for this fairness with a lower mean throughput (about 0.49 versus 0.55). The rest of the
topologies are symmetric, so both schemes are fair in such topologies. We observe
that the mean throughput of IBA is comparable to that of EBA in all these cases.


10.6.2 Random topologies 


A more interesting comparison is visible in random topologies. We first consider 
a random field of size 4km x 4km and place 100 Ad-Hoc nodes at random on it. The 
nodes have a transmission range of 500m and an interference range of 1km. Every 
node is assumed to have packets to send at all times. Note that at any slot, multiple 
transmission can take place at different parts of the field. 

Figs. 10.9 and 10.10 express the simulation results for EBA and IBA, on the same
topology. We denote a node by a circle with its center at the node’s location. The area
of the circle is proportional to the throughput achieved by the node (i.e., number of
successful transmissions) scaled by the total number of slots.

Fig. 10.9. Node throughput in a random topology: Exponential Backoff.

Fig. 10.10. Node throughput in a random topology: Impatient Backoff.

By comparing the two graphs visually, we can qualitatively see the fairer nature 
of IBA. EBA in Fig. 10.9 includes many nodes with very small throughput — all 
of whom are able to increase their throughput in IBA. Jain’s Fairness Index in the 
EBA simulation is only 0.66 and it increases to 0.75 for IBA. The lowest throughput 
achieved by a node in EBA is merely 0.0090, while the lowest throughput in IBA 
is 0.0490. The mean throughput for the two schemes is comparable, at 0.1066 and 
0.1046. 

Table 10.2 summarizes the results from simulations across several random
topologies. In each case, we generated a different random topology and ran the simulations
on both backoff schemes. We notice that the mean throughput is comparable,
while IBA achieves a significantly higher fairness. Also, the minimum throughput
received by an IBA node is typically 3 to 5 times higher than in EBA.





                        | Exponential Backoff               | Impatient Backoff
Topology                | Mean Thrpt | Min Thrpt | Fairness | Mean Thrpt | Min Thrpt | Fairness
100 nodes; 4 x 4km²     | 0.1066     | 0.0090    | 0.66     | 0.1046     | 0.0490    | 0.75
100 nodes; 4 x 4km²     | 0.1029     | 0.0140    | 0.71     | 0.1001     | 0.0440    | 0.82
100 nodes; 4 x 4km²     | 0.1032     | 0.0120    | 0.71     | 0.1031     | 0.0380    | 0.81
50 nodes; 2.5 x 2.5km²  | 0.0842     | 0.0130    | 0.71     | 0.0839     | 0.0490    | 0.83
50 nodes; 2.5 x 2.5km²  | 0.0963     | 0.0220    | 0.68     | 0.0903     | 0.0430    | 0.74
25 nodes; 2 x 2km²      | 0.1188     | 0.0040    | 0.69     | 0.1004     | 0.0590    | 0.87

Table 10.2. Simulation results for random topologies


10.6.3 Reset propagation and lost resets 


As explained in Section 10.3.4, IBA requires all nodes to reset their mean backoff 
delays when any mean backoff goes below a reset limit. In a practical situation, this 
requires the propagation of the reset message through the network. We model this in 
our simulations by propagating the reset message hop by hop from the originating 
node. A node resets its backoff only when it receives a reset message. The reset 
messages have a time-to-live field to ensure expiration after a single reset. Also, a 
node does not reset more than once in a fixed number of slots, thus multiple reset 
messages starting from different parts of the network at around the same time cause 
the backoff delays to be reset only once. 

Furthermore, reset messages may get lost on the way. Our simulations account
for situations where up to 10% of reset messages get lost.

It turns out that reset propagation and loss have only a marginal effect on the throughput
and fairness of IBA. The simulation results given above in Section 10.6.2 in fact take
into account both these effects. Qualitatively, we can see why reset propagation
and loss do not affect the long-term fairness of the system. A delayed or lost reset
message implies that the intended recipient node does not reset its backoff in a timely
manner and hence continues to have a low backoff. As a result, it wins a few
contentions unfairly, which in turn causes it to increase its backoff.
The unfairness only allows a few extra packet transmissions and does not persist.
Thus the scheme is fair in the long run.


10.6.4 Variations in simulation scenarios 


We also attempt a few variations in the simulation scenarios. In each of the cases, 
the throughput and fairness numbers are comparable to the previous cases, hence we 
do not repeat them explicitly. 

Movement: We want to investigate the effect of random walk movement on IBA. 
Every node moves randomly in discrete steps every 10 time slots. 

Changing Number of Nodes: We also consider the case when nodes do not
always have packets to send. This is simulated by considering blocks of slots when
a node is either awake or asleep. For simple topologies, we evaluate the fairness of
IBA against the fair share of bandwidth based on the active nodes during each
block of slots. The results show that IBA maintains its fairness even with a
changing number of neighbors.

An example is shown in Table 10.3 for a clique of five nodes. Different nodes
have a different fair share since they are active for varying amounts of time, but
the IBA scheme is able to provide them with that share. These experiments yield a
fairness of 1.





Node No. | Throughput | Fair Share
    1    |   0.2040   |   0.2040
    2    |   0.2220   |   0.2217
    3    |   0.1330   |   0.1340
    4    |   0.1710   |   0.1717
    5    |   0.2080   |   0.2073

Table 10.3. IBA maintains fairness even when not all nodes are transmitting


Biased Initial Backoff Rate: We also extend the idea of an initial bias (Section
10.4.3) to a general topology. We start with an initial bias against a few selected
nodes by making their initial mean backoff 64 times the other mean backoff delays,
and observe whether these nodes indeed get a fair share of the throughput. Under
IBA, these backoff delays soon catch up and the nodes subsequently get an equal
share of the bandwidth. While their initial backoff delays are high, these nodes
tend to lose the contention, causing their mean backoff delays to be divided by $m$.
Thus they catch up in approximately $\log_m 64$ slots.
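
As a rough worked example of this catch-up time, suppose a biased node loses $k$
consecutive contentions, so that its mean backoff is divided by $m^k$, and ignore
any change in the other nodes' delays over the same period. The value $m = 2$
below is chosen purely for illustration and is not taken from the simulations.

```latex
\frac{64}{m^{k}} = 1
\quad\Longleftrightarrow\quad
k = \log_m 64,
\qquad\text{e.g. } m = 2 \;\Rightarrow\; k = \log_2 64 = 6 .
```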

10.7 Conclusions


Traditional distributed Ad-Hoc MAC protocols (e.g., 802.11b) use random backoff
delays to avoid collisions and share bandwidth amongst contending nodes. The
fundamental backoff rule is to become less aggressive upon collision. We observe
that in a network spanning multiple interference domains, this approach leads to
unfairness towards nodes in the middle of the network.

We propose a novel backoff scheme that counters the conventional backoff wis- 
dom. Nodes in our Impatient Backoff Algorithm decrease their backoff when they 
collide or are unable to send — thereby becoming more aggressive. This approach 
tends to help nodes with more neighbors and leads to a fairer allocation of bandwidth. 
The danger of the system becoming unstable due to frequent collisions is handled by 
resetting the mean backoff delays when they get too low. 

We use Markov chains to analyze IBA in simple topologies. We look at two 
extreme topologies — the unfair star topology, and the symmetric clique topology. 
By proving positive recurrence of the system, we show that IBA is indeed stable. We 
are also able to demonstrate fairness of the system, even as the number of nodes in 
the topology is changing dynamically. 

We compare the performance of IBA with an idealized slotted exponential back- 
off scheme. Results show that in simple topologies IBA is always able to achieve 
a fairness index of 1, albeit at the cost of some throughput tradeoff. In a random 
topology, IBA maintains the same mean throughput as EBA but has a significantly 
higher fairness index. Further extensions involving movements and nodes that switch 
between active and sleep phases also give similar results. 


10.8 Extensions and Future Work 


Clearly, the MAC model outlined in Section 10.3 is an idealized version that 
ignores many practical effects. The most important of these is the slotted nature of 
the protocol. It would be most interesting to develop an un-slotted version of the 
protocol, which would allow different packet lengths and also remove the need for 
the nodes to be synchronized. An RTS/CTS mechanism for IBA will also be useful 
to solve the hidden terminal problem. 

IBA is by no means a complete MAC protocol. It is however a radically differ- 
ent approach to backoff mechanisms in distributed Ad-Hoc networks. This chapter 
shows the benefits of the novel scheme, and also suggests mechanisms to make IBA 
practicable. The hope is for follow up work in this space to make this idea a reality. 

11.1 Introduction


As an important concept in network security, trust is interpreted as a set of
relations among agents participating in the network activities. Trust relations are
based on the previous behavior of an agent within a protocol. Trust establishment
in distributed and resource-constrained networks, such as mobile ad hoc networks
(MANETs), sensor networks and ubiquitous computing systems, is much more
difficult but more crucial than in traditional hierarchical architectures, such as
the Internet and wireless LANs centered on base stations or access points.
Generally, this type of distributed network has neither pre-established
infrastructure, nor centralized control servers or trusted third parties (TTP). The
trust information or evidence used to evaluate trustworthiness is provided by
peers, i.e. the agents that form the network. Furthermore, resources (power,
bandwidth, computation etc.) are normally limited because of the wireless and ad
hoc environment, so the trust evaluation procedure should only rely on local
information. Schemes that depend only on local interaction also have the desired
emergent property that enables fast reaction to network member changes, topology
changes and security changes that frequently happen in mobile networks. Therefore,
the essential and unique properties of trust management in this new paradigm of
wireless networking, as opposed to traditional centralized approaches, are:
uncertainty and incompleteness of trust evidence (the trust value lies between
$-1$ and $1$); locality in trust information exchange; and distributed
computation.

Trust establishment is a process starting from a small set of agents who are known 
to be trustworthy. For example, the first few peers to join a network are often known 
to be trustworthy, while the majority are neutral, i.e. with trust value 0. They are 
subsequently evaluated by agents who have direct interaction with them. Those eval- 
uating agents are either the physical or logical neighbors of target agents. Based on 
their observations and evidence, they are able to provide opinions on the target agent, 
to build the trust value (also called reputation) of the target agent. The whole network 
therefore evolves as the local interactions iterate from “isolated trust islands” to “a 


184 J.S. Baras and T. Jiang 


connected trust graph.” Our interest is to discover rules and policies that establish 
trust-connected networks using only local interactions, to understand the impact of 
local interactions on the whole network and also to find the conditions under which 
trust spreads to a maximum set, as well as the parameters that speed up or slow down 
this transition. 

There have been several works on trust computation based on interactions with 
one-hop physical neighbors. In [2], for instance, first-hand observations are ex- 
changed between neighboring nodes, where node A adjusts his opinion for B, based 
on how close B’s evidence is to A’s previous opinion about another node C. It pro- 
vides an innovative model to link nodes’ trustworthiness with the quality of the evi- 
dence they provide. Our work emphasizes the inference of trust value instead of gen- 
erating the direct trust, which is similar to [7] and [8], where weighted averages were 
used to aggregate multiple votes for trust evaluation and provided promising results 
on using this simple local interaction rule to correctly evaluate trust in distributed 
networks. Particularly in [7], different kinds of malicious behaviors have been sim- 
ulated and their results showed that by ranking nodes according to the trust value, 
the network application (in their case, file downloading in p2p networks) doesn’t get 
affected by malicious nodes. However, the results in both [7] and [8] are based on 
simulation. In this chapter, we analyze a local interaction rule using graph theory 
and provide a theoretical justification for network management that facilitates trust 
propagation. 

In wireless networks such as mobile ad hoc networks and sensor networks, most
of the functions (routing, mobility management, and security) must rely on
cooperation between nodes. In addition, such cooperation utilizes local information
and local (between neighbors) interactions. This is probably the most important
difference between this type of network and traditional networks, such as the
Internet and cellular networks.

In the wireless networks of interest in this chapter, nodes are not under the control 
of any central authority. In other words, each node is its own authority. The network 
is generated in a more distributed and asynchronous manner. In this situation, the 
most reasonable assumption is that each node will try to maximize its benefit by
exploiting the network, even if this means adopting a selfish behavior. This
selfishness means that nodes are not willing to participate, without additional
incentives, in the common networking functions, such as route discovery, packet
forwarding and security management, which always consume resources such as
battery power and bandwidth.

Over the last few years, there has been an increasing amount of research on de- 
signing mechanisms to encourage nodes to collaborate. Basically, the approaches 
taken can be divided into two categories: one is based on incentive techniques, which 
normally rely on various kinds of trust or reputation systems to promote cooperation 
and circumvent misbehaving nodes [2, 3, 9]; the other is inspired from game theory, 
where payoffs are assigned to different strategies of nodes, and Nash equilibria in 
non-cooperative games are considered to be the optimal and stable solutions [5, 13]. 

In this chapter, the interactions among nodes are also modeled as games, but as
cooperative games rather than non-cooperative games, in which players always
conflict. In cooperative games, players form coalitions to obtain the optimum
payoffs. The key assumption that distinguishes cooperative game theory from
non-cooperative game theory is that players can negotiate effectively [10]. We will
discuss how negotiation can help to form the grand coalition that includes all
players. Another way to form a grand coalition is through a trust establishment
mechanism: nodes which do not cooperate will be penalized by the trust
establishment mechanism. How trust establishment mechanisms can help in
cooperative games is also analyzed. Furthermore, we show that trust establishment
and evolution of cooperation go hand in hand by viewing the whole network as a
distributed dynamical system.

As discussed, trust computation is distributed and restricted to only local inter- 
actions in a MANET. Each node, as an autonomous agent, makes the decision on 
trust evaluation individually. The decision is based on information it has obtained 
by itself or from its neighbors. Those aspects are analogous to situations in the
statistical mechanics of complex systems with game theoretic interactions. Game
theory, and more specifically the theory of evolutionary games, provides the
framework for modeling individual interactions. This circle of ideas has a lot in
common with randomized optimization methods from statistical physics.

One of the simplest local interaction models is the Ising model [11], which
describes the interaction of magnetic moments or spins, where some spins seek to
align with one another (ferromagnetism), while others try to anti-align
(antiferromagnetism). The Ising spin model consists of $n$ spins. Each spin is
either in position “up” or “down.” Any configuration of spins is denoted as
$s = \{s_1, s_2, \ldots, s_n\}$, where $s_i = 1$ or $-1$ indicates that spin $i$ is
up or down, respectively. A Hamiltonian, or energy, for a configuration $s$ is
given by

$$H(s) = -\frac{1}{T} \sum_{i \in V} \sum_{j \in N_i} J_{ij} s_i s_j - \frac{mH}{T} \sum_{i} s_i \tag{11.1}$$

where $T$ is the temperature. The first term represents the interaction between
spins. The second term represents the effect of the external (applied) magnetic
field. In the Ising model the local interaction “strengths” are all equal to a
parameter $J$. In the more complex case of spin glass the $J_{ij}$ are different
and may even come from random processes [11].

The problem of computing the ground state (global minimum of energy) for the
Ising model (and even more so for spin glasses) is an NP-hard problem. There are
$2^n$ possible configurations for the model, so the computation becomes infeasible
when $n$ gets large, and we must use heuristic methods to find low energy
configurations. As proposed in [1], we could imagine that the spins try to reduce
their own frustration (or energy) individually, and come up with an interesting
cooperative game. In game theoretic terms, the payoff for node $i$, when the graph
has a configuration $s = \{s_1, s_2, \ldots, s_n\}$, is

$$\pi_i = \sum_{j \in N_i} J_{ij} s_i s_j. \tag{11.2}$$

When $J_{ij} = 1$, the agents are rewarded for aligning their spin states; when
$J_{ij} = -1$ they want to take on opposite states (anti-align their spins) in
order to maximize their payoffs. Agents interact in order to maximize their own
payoffs.

This model provides the inspiration for our approach, as it can be directly used
for distributed trust computation. Let $s_i$ be the trust value assigned to node
$i$, where $s_i \in \{-1, 1\}$. Node $i$ will be assigned a trust value according
to the opinion of the majority of its neighbors. We set $J_{ij} = 1$,
$\forall j \in N_i$. Then the payoff of agent $i$ is
$\pi_i = s_i \sum_{j \in N_i} s_j$. In order to maximize $\pi_i$, $i$ will set
$s_i$ to the same sign as $\sum_{j \in N_i} s_j$, which is exactly the value given
by a majority vote of the neighbors. Simulations using Simulated Annealing (SA)
show that the average payoff of the whole network is a function of the
temperature $T$ in the Ising model. High temperatures, in the trust computation
context, mean that the agents are very conservative and not willing to change
their trust values; the payoffs are then near 0, which is the expected payoff for
a random set of $s_i$ from $\{-1, 1\}$. As the temperature decreases (aggressive
agents), the algorithm becomes greedier and payoffs increase, and most of the
nodes reach agreement. Recently there has been very strong interest in the
application and extension of such optimization schemes from the statistical
mechanics of spin glasses and associated games to optimization and other problems
in information technology [11].
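
The majority-vote rule above can be sketched as a greedy best-response update.
This is a minimal illustration of the zero-temperature limit only, not the
simulated annealing procedure used for the reported simulations; the
adjacency-dictionary graph and the function names `payoff`, `majority_update`
and `run_greedy` are assumptions made for the example, as is the choice to keep
the current value on a tie.

```python
def payoff(adj, s, i):
    """Payoff of node i with all interaction strengths equal to 1:
    s_i times the sum of the neighbours' trust values."""
    return s[i] * sum(s[j] for j in adj[i])

def majority_update(adj, s, i):
    """Best response of node i: take the sign of the neighbours' sum.
    A tie leaves the current value unchanged (an assumption of this sketch)."""
    total = sum(s[j] for j in adj[i])
    if total > 0:
        return 1
    if total < 0:
        return -1
    return s[i]

def run_greedy(adj, s, max_sweeps=100):
    """Sweep over the nodes in a fixed order until no node wants to change."""
    s = dict(s)  # work on a copy
    for _ in range(max_sweeps):
        changed = False
        for i in sorted(adj):
            new_value = majority_update(adj, s, i)
            if new_value != s[i]:
                s[i] = new_value
                changed = True
        if not changed:
            break
    return s
```

For example, on the graph `{0: [1, 2], 1: [0, 2], 2: [0, 1, 3], 3: [2, 4], 4: [3]}`
with initial values `{0: 1, 1: 1, 2: 1, 3: -1, 4: 1}`, node 3 is outvoted by its
two neighbors and flips, after which every node agrees.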

In the Ising model, and the more complex models of spin glasses, an important 
characteristic is phase transition phenomena. It is observed that when the tempera- 
ture is high, all the spins behave nearly independently (no long-range correlation), 
whereas when temperature is below a critical temperature co, all the spins tend to 
stay the same (i.e., cooperative behavior). Phase transitions are also studied in evolu- 
tionary prisoner’s dilemma games [14]. Phase transition is a common phenomenon 
that takes place in any combinatorial structure, where a large combinatorial struc- 
ture can be modeled as a system consisting of many locally interacting components. 
A phase transition corresponds to a change in some global (macroscopic) parame- 
ter of the system as the local parameters are varied. Distributed trust computation is 
essentially a cooperative game where nodes interact with their neighbors locally. 

The structure of the chapter is as follows. In Section 11.2 we develop the network 
model and the framework of cooperative games for analyzing cooperation among 
the agents. In Section 11.3 we analyze the cooperative game framework and show 
that agent cooperation can be achieved employing negotiations between the agents. 
We also develop a dynamic distributed trust mechanism framework and demonstrate 
that it can also induce cooperation among agents, albeit without negotiations. In 
Section 11.4 we investigate the dynamic evolution of both cooperative games and 
trust mechanisms and establish certain quantitative measures and characteristics of 
the “spread” of cooperative behavior among agents. Finally, Section 11.5 contains 
our conclusions and a brief description of future research directions. 

11.2 Problem formulation


11.2.1 System model 


The network is modeled as an undirected graph $G(V, E)$. Throughout this chapter,
we use the terms node, player and agent interchangeably, where a node $i$ is an
element in the set $V$. Nodes are players that play games among themselves. Since
we only consider direct interaction among nodes, nodes only play games with their
neighbors, which are denoted as:

$$N_i \triangleq \{ j \mid (i, j) \in E \} \subseteq \{1, \ldots, N\} \setminus \{i\}$$

The neighbor set of agent $i$, $N_i$, can represent the set of agents with which
$i$ is allowed to communicate (giving rise to a logical interconnection network),
or the set of agents which $i$ can sense, transmit to or receive information from
(physical wireless communication links).

Fig. 11.1. System operation block-graph for a typical node.


In our model, each node has a self-defined playing strategy, which is denoted
by $y_i$ for node $i$. Another characteristic of each node is its trust values,
which are dependent on the opinions of other nodes. Trust values of a node can be
different for different players. For instance, $t_{ji}$ and $t_{ki}$ are the trust
values of $i$ provided by distinct players $j$ and $k$, and possibly
$t_{ji} \neq t_{ki}$. Fig. 11.1 is a block graph demonstrating how nodes interact
among their neighbors, where the payoff of node $i$ after playing games is
represented as $x_i$. The procedure is summarized as the following three rules:

• Strategy updating rule: as shown in Fig. 11.1, nodes update strategies based on
  their own payoffs. They tend to choose rules that obtain the maximum payoffs.

• Payoff function: the payoffs are functions of the strategies of all participants.
  For a specific node, the payoff only depends on strategies of its neighbors and
  itself.

• Trust computation rule: trust values are computed based on votes, which are
  provided by neighbors and are related to the history (strategies and trust
  values) of the target node. Since trust values eventually have impact on the
  payoff of the node, there is a dotted line in Fig. 11.1 from trust values to
  payoff to represent their implicit relation.


For simplicity, we assume the system is memoryless. All values are dependent 
only on parameter values at most one time step in the past. Therefore, the system can 
be modeled as a discrete-time system: 


$$y_i(t+1) = f^i\big(x_i(t),\, y_i(t),\, y_j(t),\, t_{ji}(t)\big) \tag{11.3}$$
$$t_{ik}(t) = g^i\big(t_{ij}(t),\, v_{jk}(t)\big) \quad \forall k \in N \tag{11.4}$$
$$x_i(t) = h^i\big(y_i(t),\, y_j(t)\big) \tag{11.5}$$
$$v_{ij}(t) = p^i\big(y_j(t),\, t_{ij}(t)\big) \tag{11.6}$$

where $j$ stands for all neighbors of $i$, and $v_{ij}$ is the value node $i$ votes
for $j$. In Section 11.4, we will analyze the dynamics of the system, especially
the effect of trust
propagation on the formation of cooperation. We first introduce the basic element of 
this system: the cooperative games among neighboring nodes. 


11.2.2 Games 


In this part, we give the formal definitions of the interaction games. In our work,
we consider two-person games with perfect information, say, player (or node) $P_1$
interacts with player (or node) $P_2$.


Definition 1 (Strategy). A strategy $\gamma_i$ for $P_i$ is the alternative $P_i$
chooses based on the information it currently holds. The set of all strategies of
$P_i$ is called its strategy set (space), and it is denoted by $\Pi_i$.


Definition 2 (Payoff). The payoff of player $P_i$ is a function of the strategies
of both players, which is denoted by $x_i = f_i(\gamma_1, \gamma_2)$.


In a game, two rational players choose their strategies based on the information 
they have, and aim to achieve the optimum payoff. Games are generally divided into 
two categories: non-cooperative games and cooperative games. The essential differ- 
ence of these two types of games is that in cooperative games players are allowed to 
negotiate while in non-cooperative games players play the game for their own sake. 
Therefore, in cooperative games, correlated mixed strategies are allowed, and the 
payoff can be transferred from one player to the other (though not always linearly). 
In what follows we will compare two different games by providing simple example 
games; our game model is based on a simple cooperative game and the interactions 
among neighbors. 


11 Cooperation, Trust and Games in Wireless Networks 189 


Non-cooperative vs. cooperative games 


One of the most well-known models in two-player non-cooperative games is the
prisoner’s dilemma. In the prisoner’s dilemma, the strategy sets of both players
are $\Pi_i = \{\text{cooperate}, \text{defect}\}$. Then there are four combinations
for $(\gamma_1, \gamma_2)$ and the payoffs of the two players are assigned in a
matrix form as shown in Table 11.1, where “C” stands for cooperate and “D” for
defect.

             |  P1: C  |  P1: D
    P2: C    | (r, r)  | (s, t)
    P2: D    | (t, s)  | (p, p)

Table 11.1. Payoff matrix of prisoner’s dilemma.

The payoffs are related to whether players cooperate or not and to what extent.
For each possible pair of strategies, r is the “reward” payoff that each player
receives if both cooperate, p is the “punishment” that each receives if both
defect, t is the “temptation” that each receives if he alone defects and s is the
“sucker’s” payoff that he receives if he alone cooperates. The payoffs satisfy the
following chain of inequalities:

$$t > r > p > s.$$

Players try to maximize their payoffs. For player $P_1$, strategy D strictly
dominates strategy C: whatever his opponent does, he is better off choosing D than
C. By symmetry, D also strictly dominates C for player $P_2$. Thus two “rational”
players will defect and receive a payoff of p, while two “irrational” players can
cooperate and receive the greater payoff r.
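
The dominance argument can be checked mechanically on any numbers that satisfy
the chain of inequalities. The sketch below is only an illustration: the values
5, 3, 1, 0 and the helper names `strictly_dominates` and `best_response` are
assumptions, not quantities from the chapter.

```python
# Illustrative payoffs satisfying t > r > p > s (the numbers are an assumption).
t, r, p, s = 5, 3, 1, 0

# payoff[(mine, theirs)] is the payoff to the player who chooses `mine`.
payoff = {("C", "C"): r, ("C", "D"): s, ("D", "C"): t, ("D", "D"): p}

def strictly_dominates(a, b):
    """True if action a pays strictly more than b against every opponent action."""
    return all(payoff[(a, other)] > payoff[(b, other)] for other in ("C", "D"))

def best_response(other):
    """Action that maximizes the player's own payoff against `other`."""
    return max(("C", "D"), key=lambda mine: payoff[(mine, other)])
```

With these numbers, `strictly_dominates("D", "C")` holds and the best response
to either action is "D", even though mutual defection pays each player less
than mutual cooperation.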

In cooperative games, players are allowed to negotiate and use the strategies
according to their committed agreement. Under such an assumption, rational players
either cooperate at the same time or defect simultaneously. If two players do not
cooperate, the payoff they get is called the disagreement vector
$f^* \in \mathbb{R}^2$. If they cooperate, the players negotiate about which point
in the set of feasible payoffs $L \subset \mathbb{R}^2$ they will agree upon. So in
cooperative games we need to investigate: 1) whether players are willing to reach
a consensus on which feasible payoff to realize; 2) how to allocate the payoffs
among the players. We can analyze a simple cooperative game that is a modification
of the prisoner’s dilemma: the disagreement vector is $f^* = (p, p)$, where for
simplicity we let $p = 0$, and the payoffs are defined as

$$x_1 = f(a_2) - c\,a_1$$
$$x_2 = f(a_1) - c\,a_2$$
$$a_1 + a_2 \le E$$

where $a_1$ and $a_2$ are some limited resources (with limit $E$) shared by two
players, such as money or bandwidth in the network context, and $f$ is a concave
function. Fig. 11.2 depicts an example of the players’ payoffs.

[Figure: the region L of feasible payoffs]

Fig. 11.2. Illustration of a two-player cooperative game.


The negotiation result $x = (x_1, x_2)$ satisfies the following conditions:


1. $x \in L$ (feasibility);
2. $x \ge f^*$ (rationality);
3. $x' \in L$, $x' \ge x$ imply $x' = x$ (Pareto-optimality).


Then the boundary of the compact, convex feasible set
$D = L \cap \{x : x \ge f^*\}$, i.e. the curve $x_1 = g(x_2)$ in Fig. 11.2, is the
set of candidates for negotiation. The question then is: on which point would the
agents agree if they cooperate? This will be discussed in Sect. 11.3.


Games on networks 


In this chapter, we consider cooperative games on networks, where nodes play
cooperative games with their neighbors iteratively. Assume that at each time step,
two neighboring nodes only play the game once. Cooperative games are normally
represented in characteristic function form, which consists of a finite set
$\mathcal{N} = \{1, \ldots, N\}$, the set of players, and a function (the
characteristic function) $v : 2^{\mathcal{N}} \to \mathbb{R}$ defined on all subsets
(coalitions) of $\mathcal{N}$ with $v(\emptyset) = 0$. We denote such a game as
$\Gamma = (\mathcal{N}, v)$. Define $S$, a subset of $\mathcal{N}$, as a coalition
if all nodes in $S$ cooperate. Then $v(S)$ is interpreted as the maximum utility
(payoff) $S$ can get without the cooperation of the rest of the players
$\mathcal{N} \setminus S$. In order to simplify our analysis, we assume the payoff
only depends on the two interacting parties, and the feasible payoff set of the
two-player game is as shown in Fig. 11.2. Suppose $y_{ij}$ is the payoff of $i$
from the game between $i$ and $j$. Since games are played on networks,
$y_{ij} \neq 0$ only if $i$ and $j$ are neighbors, and we set $y_{ij} = 0$ if
$i = j$ or $i$ and $j$ are not neighbors. For instance, consider two neighboring
nodes $i$ and $j$ and let $S = \{i, j\}$, then

$$v(S) = \max\{ y_{ij} + y_{ji} \}. \tag{11.7}$$

The payoffs that maximize $v(S)$ lie on the Pareto frontier of the convex set $L$.
Substituting $y_{ij} = g(y_{ji})$ into (11.7), we can derive the payoffs that
maximize $v(S)$, denoted as $(x_{ij}, x_{ji})$. In a geometric interpretation,
$(x_{ij}, x_{ji})$ is the point on the boundary of $L$ from where a tangent to $L$
can be drawn with slope $-1$. It is obvious that $x = (x_{ij}, x_{ji})$ satisfies
the negotiation conditions.

The following are assumptions made and used in this chapter: 


• The games are with transferable utility, i.e., payoffs are given in linearly
  transferable utility.

• The cooperation is bilateral, i.e. for two neighboring nodes, either both
  cooperate or neither cooperates. This is because there is no incentive for a node
  to altruistically contribute without receiving some payoff.

• Nodes cooperate with all the neighbors in the same coalition. If $i$ is in
  coalition $S$, $j \in N_i$ and $j \in S$, then $i$ cooperates with $j$.


As we defined, a coalition is a subset of nodes that cooperate with all their
neighbors in the coalition. Among all coalitions, there are so-called maximum
coalitions which are not subsets of any other coalition, i.e., if $S$ is a maximum
coalition, then $\forall i \in S$, $j \notin S$, $i$ and $j$ do not cooperate with
each other. In this chapter, all coalitions are maximum coalitions, so we omit
maximum from now on. We can easily find the characteristic function of our
cooperative game, which is the summation of the payoffs from all cooperative pairs
in the coalition, as:

$$v(S) = \sum_{i, j \in S} x_{ij}. \tag{11.8}$$

Notice that $\forall i$, $v(\{i\}) = 0$. We denote the cooperative game defined
from (11.8) as $\Gamma = (\mathcal{N}, v)$.

In the next section, we will describe the details of the system model. Based on the 
model, we will investigate stable solutions for enforcing cooperation among nodes, 
and demonstrate two efficient methods for achieving such cooperation: negotiation 
and trust mechanism. 


11.3 Cooperation in games 


11.3.1 Cooperative games with negotiation 


In Section 11.2.2, we reviewed and defined games, especially cooperative games 
that are used in our interaction model. In this section, we investigate the impact of 
the games on the collaboration in a network. First we start with a simple fact. 


Lemma 1. If $\forall i, j$, $x_{ij} + x_{ji} \ge 0$, then
$\Gamma = (\mathcal{N}, v)$ is a superadditive game.

Proof. Suppose $S$ and $T$ are two disjoint sets ($S \cap T = \emptyset$), then

$$v(S \cup T) = \sum_{i, j \in S \cup T} x_{ij} = \sum_{i, j \in S} x_{ij} + \sum_{i, j \in T} x_{ij} + \sum_{i \in S,\, j \in T} (x_{ij} + x_{ji})$$
$$= v(S) + v(T) + \sum_{i \in S,\, j \in T} (x_{ij} + x_{ji}) \ge v(S) + v(T).$$

The last inequality holds by our assumption that $x_{ij} + x_{ji} \ge 0$.


The main concern in cooperative games is how the total payoff from a partial or
complete cooperation of the players is divided among the players. A payoff
allocation is a vector $x = (x_i)_{i \in \mathcal{N}}$ in $\mathbb{R}^N$, where each
component $x_i$ is interpreted as the payoff allocated to player $i$. We say that an
allocation $x$ is feasible for a coalition $S$ iff $\sum_{i \in S} x_i \le v(S)$.

When we think of a reasonable and stable payoff, the first thing that comes to
mind is a payoff that would give each coalition at least as much as the coalition
could enforce itself without the support of the rest of the players. In this case,
players could not get better payoffs by forming separate coalitions different from
the grand coalition $\mathcal{N}$. The set of all these payoff allocations of the
game $\Gamma = (\mathcal{N}, v)$ is called the core and is formally defined as the
set of all $n$-vectors $x$ satisfying the linear inequalities:

$$x(S) \ge v(S) \quad \forall S \subset \mathcal{N}, \tag{11.9}$$
$$x(\mathcal{N}) = v(\mathcal{N}), \tag{11.10}$$

where $x(S) = \sum_{i \in S} x_i$ for all $S \subseteq \mathcal{N}$. If $\Gamma$ is
a game, we will denote its core by $C(\Gamma)$. It is known that the core is
possibly empty. Therefore, it is necessary to discuss existence of the core for the
game $\Gamma$. We first give the definition of a family of common games: convex
games [6]. The convexity of a game can be defined in terms of the marginal
contribution of each player, which plays the role of first difference of the
characteristic function $v$. Convexity of $v$ can be defined in terms of the
monotonicity of its first differences. The first difference (or the marginal
contribution of player $i$) $d_i : 2^{\mathcal{N}} \to \mathbb{R}$ of $v$ with
respect to player $i$ is

$$d_i(S) = \begin{cases} v(S \cup \{i\}) - v(S) & \text{if } i \notin S \\ v(S) - v(S \setminus \{i\}) & \text{if } i \in S. \end{cases}$$

A game is said to be convex if for each $i \in \mathcal{N}$, $d_i(S) \le d_i(T)$
holds for any coalitions $S \subseteq T$.

Lemma 2. $\Gamma(\mathcal{N}, v)$ is a convex game.


Proof. For $\Gamma$, $d_i(S) = \sum_{j \in S,\, j \neq i} (x_{ji} + x_{ij})$. Taking
two sets $S \subseteq T$,

$$d_i(T) - d_i(S) = \sum_{j \in T \cap S^c} (x_{ji} + x_{ij}) \ge 0.$$

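
For a small network, superadditivity and convexity can be checked by brute force
directly from their definitions. The sketch below is an illustration under
stated assumptions rather than part of the analysis: the pairwise payoffs are
stored in a hypothetical square matrix `x` with zeros for non-neighbors, the
function names are invented for the example, and the enumeration is exponential
in the number of players.

```python
from itertools import combinations

def subsets(players):
    """All subsets of `players`, as frozensets."""
    players = list(players)
    return [frozenset(c) for k in range(len(players) + 1)
            for c in combinations(players, k)]

def v(x, S):
    """Characteristic function: sum of x[i][j] over ordered pairs i, j in S."""
    return sum(x[i][j] for i in S for j in S if i != j)

def is_superadditive(x, players):
    """v(S u T) >= v(S) + v(T) for every pair of disjoint coalitions."""
    subs = subsets(players)
    return all(v(x, S | T) >= v(x, S) + v(x, T)
               for S in subs for T in subs if not (S & T))

def first_difference(x, S, i):
    """Marginal contribution d_i(S) of player i with respect to coalition S."""
    if i in S:
        return v(x, S) - v(x, S - {i})
    return v(x, S | {i}) - v(x, S)

def is_convex(x, players):
    """d_i(S) <= d_i(T) for every player i and all coalitions S inside T."""
    subs = subsets(players)
    return all(first_difference(x, S, i) <= first_difference(x, T, i)
               for i in players for S in subs for T in subs if S <= T)
```

For instance, the three-node path with `x = [[0, 2, 0], [-1, 0, 1], [0, 1, 0]]`
has nonnegative pairwise sums and passes both checks, whereas the two-node game
`[[0, 1], [-3, 0]]`, whose pairwise sum is negative, fails both.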
The core of a convex game is nonempty ([6]), thus $C(\Gamma) \neq \emptyset$. By
Lemma 2, we have the following theorem.


Theorem 1. $\Gamma = (\mathcal{N}, v)$ has a nonempty core.


Now let us find one of the payoff allocations that are in the core. For any pair of
players $(i, j)$, suppose the payoff allocation of the game between $i$ and $j$ is
$(\hat{x}_{ij}, \hat{x}_{ji})$. Then we have the following


Corollary 1. If the payoff allocation satisfies $\hat{x}_{ij} \ge 0$ and
$\hat{x}_{ji} \ge 0$, then the payoff allocation
$\hat{x}_i = \sum_{j \in N_i} \hat{x}_{ij}$ is in the core $C(\Gamma)$.


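Proof. Take an arbitrary subset $S \subseteq \mathcal{N}$,

$$\sum_{i \in S} \hat{x}_i = \sum_{i \in S} \sum_{j \in N_i} \hat{x}_{ij} \ge \sum_{i, j \in S} \hat{x}_{ij} = \sum_{i, j \in S} x_{ij} = v(S);$$

the inequality holds because $\hat{x}_{ij} \ge 0$, $\forall i, j \in \mathcal{N}$,
and the equality that follows it because the allocation within each pair preserves
the pairwise total, $\hat{x}_{ij} + \hat{x}_{ji} = x_{ij} + x_{ji}$. For
$S = \mathcal{N}$ the inequality becomes an equality, since
$N_i \subseteq \mathcal{N}$.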





Because we only consider transferable utility games,
$\hat{x}_{ij} + \hat{x}_{ji} = x_{ij} + x_{ji} \ge 0$. Therefore
$(\hat{x}_{ij}, \hat{x}_{ji})$ could be constructed in the following way:

$$\hat{x}_{ij} = \begin{cases} x_{ij} & \text{if } x_{ij} \ge 0,\ x_{ji} \ge 0 \\ x_{ij} + \lambda_{ij} x_{ji} & \text{if } x_{ij} < 0,\ x_{ji} > 0 \\ (1 - \lambda_{ij})\, x_{ij} & \text{if } x_{ij} > 0,\ x_{ji} < 0 \end{cases}$$

where $0 \le \lambda_{ij} = \lambda_{ji} \le 1$, and $\hat{x}_{ij} \ge 0$ is
achieved by carefully choosing $\lambda_{ij}$.
Obviously, the payoff allocation we provided in Corollary 1 is a set of points in
the core, while there generally exist more points in the core that are not covered
in the Corollary. However, this solution indicates a way to encourage cooperation
in the whole network. The players that have positive gain can negotiate with their
neighbors by sacrificing certain gain (offering their partial gain
$\lambda_{ij} x_{ij}$). Though they cannot achieve their best possible payoff, they
can set up a cooperative relation with their neighbors. This is definitely
beneficial for the players who negotiate and sacrifice, since without cooperation
they cannot get anything. This solution is also efficient and scalable, because
players only need to negotiate with their direct neighbors.
Thus we established cooperative games among nodes in the network, and de- 
scribed an efficient way to achieve cooperation throughout the network. In the next 
section, we are going to discuss solutions by employing trust mechanisms, which do 
not require negotiation and the assumption on zij + £ji > 0 can also be relaxed. 
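
The negotiated allocation of Corollary 1 can be checked numerically on a toy instance. The following Python sketch is illustrative only and is not part of the original analysis: the three-node payoff matrix `x`, the helper names, and the rule for picking $\lambda_{ij}$ (the smallest value that lifts the negative side to zero) are assumptions. It takes $v(S) = \sum_{i,j \in S} x_{ij}$, requires $x_{ij} + x_{ji} \ge 0$ for every pair, and tests core membership by brute force over all coalitions.

```python
from itertools import combinations


def negotiated_payoffs(x):
    """Build x_hat from the pairwise payoffs x, following Corollary 1.

    x[i][j] is the payoff of i from its game with j (0 if not neighbors).
    Assumes x[i][j] + x[j][i] >= 0 for every pair (transferable utility).
    """
    n = len(x)
    x_hat = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            a, b = x[i][j], x[j][i]
            if a >= 0 and b >= 0:
                x_hat[i][j], x_hat[j][i] = a, b
            elif a < 0:
                # j gains and gives up the share lam * b; lam = -a / b is the
                # smallest choice (an assumption) that makes x_hat[i][j] >= 0.
                lam = -a / b
                x_hat[i][j] = a + lam * b
                x_hat[j][i] = (1 - lam) * b
            else:
                # Symmetric case: i gains, j is on a negative link.
                lam = -b / a
                x_hat[j][i] = b + lam * a
                x_hat[i][j] = (1 - lam) * a
    return x_hat


def v(S, x):
    """Characteristic function v(S): total pairwise payoff inside coalition S."""
    return sum(x[i][j] for i in S for j in S if i != j)


def in_core(alloc, x, tol=1e-9):
    """Brute-force core test: efficiency plus rationality of every coalition."""
    n = len(x)
    players = range(n)
    if abs(sum(alloc) - v(players, x)) > tol:
        return False
    return all(
        sum(alloc[i] for i in S) >= v(S, x) - tol
        for size in range(1, n + 1)
        for S in combinations(players, size)
    )


# Hypothetical 3-node network; x[0][1] < 0 makes (0, 1) a negative link for node 0.
x = [[0.0, -1.0, 2.0],
     [3.0, 0.0, 1.0],
     [1.0, 0.5, 0.0]]
x_hat = negotiated_payoffs(x)
alloc = [sum(row) for row in x_hat]
print(alloc, in_core(alloc, x))
```

In this instance node 1 gives up part of its gain on the link with node 0, so that no pairwise payoff is negative while the pairwise totals are unchanged.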


11.3.2 Trust mechanism 


Trust is a useful incentive for encouraging nodes to collaborate. Nodes who re- 
frain from cooperation get lower trust value and will be eventually penalized because 
other nodes tend to only cooperate with highly trusted ones. From Fig. 11.1 and the 
corresponding system equations, the trust value of each node eventually influences
its payoff. Let us assume, for node $i$, that the loss of not cooperating with node $j$
is a nondecreasing function of $x_{ji}$, because the more $j$ loses, the more effort $j$
undertakes to reduce the trust value of $i$. Denote the loss for $i$ being non-cooperative
with $j$ as $l_{ij} = f(x_{ji})$, with $f(0) = 0$. For simplicity, assume the characteristic function
is a linear combination of the original payoff and the loss:

\[
v'(S) = \sum_{i, j \in S} x_{ij} - \sum_{i \in S,\, j \notin S} f(x_{ji}). \tag{11.11}
\]

The game with characteristic function $v'$ is denoted as $\Gamma'(N, v')$. We then have


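Theorem 2. If $\forall i, j$, $x_{ij} + f(x_{ji}) \ge 0$, then $C(\Gamma') \neq \emptyset$ and $x_i = \sum_{j \in N_i} x_{ij}$ is a
point in $C(\Gamma')$.


Proof. First we prove $\Gamma'$ is a convex game, given $x_{ij} + f(x_{ji}) \ge 0$. We have that
$\forall i \in N$ in $\Gamma'$,

\[
d_i(S) = \sum_{j \in S,\, j \neq i} (x_{ij} + x_{ji}) - \sum_{k \notin S} f(x_{ki}) + \sum_{j \in S,\, j \neq i} f(x_{ij}).
\]

Letting $S \subseteq T$,

\begin{align*}
d_i(T) - d_i(S) &= \sum_{j \in T \cap S^c} (x_{ij} + x_{ji}) + \sum_{k \in T \cap S^c} f(x_{ki}) + \sum_{j \in T \cap S^c} f(x_{ij}) \\
&= \sum_{j \in T \cap S^c} \bigl( (x_{ij} + f(x_{ji})) + (x_{ji} + f(x_{ij})) \bigr) \ge 0.
\end{align*}

Therefore $C(\Gamma')$ is nonempty. Next, we verify that $x_i = \sum_{j \in N_i} x_{ij}$ is in the core.
For any $S \subseteq N$,

\begin{align*}
\sum_{i \in S} x_i - v'(S) &= \sum_{i \in S} \sum_{j \in N_i} x_{ij} - \sum_{i, j \in S} x_{ij} + \sum_{i \in S,\, k \notin S} f(x_{ki}) \\
&= \sum_{i \in S,\, j \notin S} (x_{ij} + f(x_{ji})) \ge 0.
\end{align*}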














The payoff $x_i = \sum_{j \in N_i} x_{ij}$ does not need any payoff negotiation.
Thus we showed that by introducing a trust mechanism, all nodes are induced to
collaborate with their neighbors without any negotiation.
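
The two claims of Theorem 2 can be verified by brute force on a small instance. The Python sketch below is illustrative and rests on assumptions that are not in the text: the loss function `f` (half of the positive part of the payoff), the three-node payoff matrix `x`, and all helper names are hypothetical. The chosen `f` is nondecreasing with $f(0) = 0$, and the instance satisfies $x_{ij} + f(x_{ji}) \ge 0$ for all pairs even though one link is negative.

```python
from itertools import combinations


def f(payoff):
    """Hypothetical loss function: nondecreasing with f(0) = 0."""
    return 0.5 * max(payoff, 0.0)


def v_trust(S, x):
    """Characteristic function v'(S) of Eq. (11.11)."""
    S = set(S)
    outside = set(range(len(x))) - S
    inside = sum(x[i][j] for i in S for j in S if i != j)
    loss = sum(f(x[j][i]) for i in S for j in outside)
    return inside - loss


def subsets(players):
    """Yield every subset of players as a frozenset."""
    players = list(players)
    for size in range(len(players) + 1):
        for combo in combinations(players, size):
            yield frozenset(combo)


def is_convex(x, tol=1e-9):
    """Check d_i(S) <= d_i(T) for all S subset of T and every i outside T."""
    n = len(x)
    for T in subsets(range(n)):
        for S in subsets(T):
            for i in set(range(n)) - T:
                d_S = v_trust(S | {i}, x) - v_trust(S, x)
                d_T = v_trust(T | {i}, x) - v_trust(T, x)
                if d_S > d_T + tol:
                    return False
    return True


def in_core(alloc, x, tol=1e-9):
    """Efficiency plus coalitional rationality with respect to v'."""
    n = len(x)
    if abs(sum(alloc) - v_trust(range(n), x)) > tol:
        return False
    return all(sum(alloc[i] for i in S) >= v_trust(S, x) - tol
               for S in subsets(range(n)))


# Hypothetical payoffs: link (0, 1) is negative for node 0,
# but x[0][1] + f(x[1][0]) = -1 + 1.5 >= 0.
x = [[0.0, -1.0, 2.0],
     [3.0, 0.0, 1.0],
     [1.0, 0.5, 0.0]]
alloc = [sum(row) for row in x]   # x_i = sum_j x_ij, no negotiation
print(is_convex(x), in_core(alloc, x))
```

Unlike the negotiated allocation, here each node simply keeps its own payoffs; the loss term in $v'$ is what makes leaving the grand coalition unattractive.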

In this section, we introduced two approaches that encourage all nodes in the network
to cooperate with each other: 1) negotiation among neighbors; 2) trust mechanism.
We proved that both approaches lead to a nonempty core for the cooperative
game played in the network. However, we have only considered these two
approaches separately, and the results are based on static settings. The more
interesting problems are how the two intertwine and how the dynamics between the
two approaches converge to a cooperative network; these are discussed in the next
section.


11.4 Dynamics of cooperation


We have analyzed the effect of a trust mechanism on the formation of cooper- 
ation. However, what we concentrated on in Section 11.3.2 is the final impact of 
trust on the payoffs at the steady state. In this section, we are going to discuss two 
dynamic behaviors in the system: trust propagation and game evolution. 


11.4.1 Trust propagation 


Trust propagation is concerned with how trust evidences (usually negative evi- 
dences) propagate from the victims (those who do not receive desired services from 
their neighbors) and how the trust evidences of a certain node reach its neighborhood 
and trigger off revocation. The consequence of revocation is that the neighbors refuse 
to cooperate with the poorly-trusted node and finally isolate it. 

Our model is motivated by considering a group of agents each of whom must 
decide between two alternative actions (trust or distrust a certain node), and whose 
decisions depend explicitly on the actions of other members of the group. Appar- 
ently, the other members are those who are interacting with the agent. In economic 
terms, this entire class of problems is known generically as binary decisions with ex- 
ternalities. Though it appears as a very simple binary decision problem, it is relevant 
to surprisingly complex problems, such as those of statistical mechanics. The decision
rule in our model is a threshold rule. Agents are usually reluctant to switch their
decisions, because changing a decision usually requires more resources and time. But
once their individual threshold has been reached, even a single piece of evidence can
trigger them into switching from one state to another. Our decision rule, which is
particularly simple while capturing the essential features outlined above, is the following.
Every node keeps a state that represents its opinion on a particular node, say node $i$:
0 stands for distrust and 1 stands for trust. Suppose initially all nodes have the state 1, i.e.,
nodes first trust all others, but the state immediately changes if the node observes
non-cooperation of the particular node $i$. We model the system evolving in discrete
time. At each time step, a node observes the current states (either 0 or 1) of other
nodes it interacts with, which we call its neighbors. The node adopts state 0 if at
least a threshold fraction $\phi$ of its $k$ neighbors are in state 0, else it adopts state 1.

Because of the differences in knowledge, preferences and observational capabilities
across the nodes, the threshold $\phi$ is allowed to be heterogeneous. $\phi$ is determined
by the individual node, and can be modeled as drawn at random from a pre-defined
distribution with pdf $f(\phi)$. As we have discussed, to model the dynamics of the
revocation, the states of all nodes are initially set to 1. At a certain time, the
non-cooperative behavior of node $i$ is observed, and a fraction (usually very small,
because the network is sparse) of the nodes are switched to state 0. The whole network
then evolves at successive time steps, with all nodes updating their states in asynchronous
order according to the threshold decision rule above. Once a node has switched to
state 0, it remains at 0 for the rest of the dynamics.
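
The threshold rule described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the simulation code behind the figures: the function names, the use of a $G(n, p)$ random graph with $p = d/(n-1)$, the homogeneous threshold, and the parameter values in the small sweep are all hypothetical choices.

```python
import random


def erdos_renyi(n, avg_degree, rng):
    """Adjacency sets of a G(n, p) random graph with p = avg_degree / (n - 1)."""
    p = avg_degree / (n - 1)
    adj = [set() for _ in range(n)]
    for u in range(n):
        for w in range(u + 1, n):
            if rng.random() < p:
                adj[u].add(w)
                adj[w].add(u)
    return adj


def simulate_revocation(adj, phi, seeds, rng):
    """Run the threshold rule until no state changes; return the fraction in state 0."""
    n = len(adj)
    state = [1] * n                # 1 = trust, 0 = distrust
    for s in seeds:
        state[s] = 0               # nodes that observed the non-cooperation
    changed = True
    while changed:
        changed = False
        order = list(range(n))
        rng.shuffle(order)         # asynchronous updates in random order
        for u in order:
            if state[u] == 0 or not adj[u]:
                continue           # state 0 is absorbing; isolated nodes never switch
            distrusting = sum(1 for w in adj[u] if state[w] == 0)
            if distrusting >= phi * len(adj[u]):
                state[u] = 0
                changed = True
    return state.count(0) / n


# Small sweep over the threshold for a sparse random graph (hypothetical sizes).
rng = random.Random(0)
graph = erdos_renyi(300, 6, rng)
for phi in (0.10, 0.15, 0.20, 0.25):
    fraction = simulate_revocation(graph, phi, seeds=[0], rng=rng)
    print(f"phi={phi:.2f}  fraction accepting revocation={fraction:.3f}")
```

On a four-node path with the first node as seed, a threshold of 0.5 lets the revocation reach every node, while a threshold of 0.6 stops it at the seed, which shows how sharply the outcome depends on $\phi$ when degrees are low.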

The main objective of trust propagation is to explore how the trust revocation
depends on the network interactions. Because building relationships and exchanging
information are both costly, especially for wireless ad hoc networks, the interactions
tend to be very sparse, so we consider only the properties of networks with low
(node) degree. Our approach concentrates on two quantities: (i) the probability that
a revocation triggered from a single node (or a small fraction of nodes) is accepted
by a sufficiently large portion of the network (or a finite fraction for an infinite
network), a phenomenon we call global revocation; and (ii) the expected size of
the global revocation.





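[Figure 11.3: plot in the $(\phi, d)$ plane. Horizontal axis: threshold $\phi$, from 0.1 to 0.24;
vertical axis: $d$ (average degree); the enclosed region is labeled "Global revocation".]

Fig. 11.3. Revocation windows for the threshold model. The network is a random graph based
on Erdős and Rényi [4].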


Fig. 11.3 graphically shows the condition for global revocation. For simplicity,
we assume homogeneity, i.e., the threshold $\phi$ is the same for every node. The average
(node) degree of the network [12] is given by $d$. The line encloses the region of the
$(\phi, d)$ plane in which a large fraction (80%) of the network nodes accept the
revocation. Fig. 11.4 illustrates how the fraction of nodes accepting revocation changes
with the threshold $\phi$, for a fixed average (node) degree.

The phase transitions in Fig. 11.4 define the boundaries of the revocation windows.
The exact solutions for the phase transitions are discussed in [15], which also
provides a comparison of different network topologies. The network topology and
threshold value are therefore crucial parameters for global revocation. This gives
an important indication and reference for network management and decision control
in sparse networks, where agents interact and make decisions based on information
provided by their neighbors, and in collaboration with their neighbors.

Fig. 11.4. Percentage of nodes accepting revocation vs. threshold $\phi$, $d = 6$.


11.4.2 Game evolution 


As shown in Sect. 11.3.2, the trust mechanism drives selfish nodes to sacrifice 
part of their benefits and thus promotes cooperation. In this section, the procedure 
and dynamics of such cooperation evolution are studied. 

We assume that nodes either cooperate or do not cooperate with their
neighbors: $y_{ij} = 1$ denotes that node $i$ cooperates with its neighbor $j$, and $y_{ij} = 0$
denotes that it does not cooperate with $j$. We assume that the payoff when one of
them does not cooperate is fixed as $(0, 0)$, and as $(x_{ij}, x_{ji})$ when both cooperate. If
$x_{ij} < 0$, we call the link $(i, j)$ a negative link for node $i$; otherwise we call it
a positive link. Since all nodes are selfish, nodes tend to cooperate with neighbors that
are on positive links, while they do not wish to cooperate with neighbors on negative
links. Meanwhile, the trust mechanism is employed as the incentive for cooperation.
We assume that revocation and nullification of revocation can propagate throughout
the network as discussed in Section 11.4.1.

In our evolution algorithm, each node maintains a record of its past experience
by using the variable $\Delta_i(t)$. First define $x_{a,i}(t)$ as the payoff $i$ gains at time $t$ and
$x_{e,i}(t)$ as the expected payoff $i$ can get at time $t$ if $i$ always chooses cooperation with
all neighbors. Notice that the expected payoff can be different each time, since it
depends on whether the neighbors cooperate or not at the specific time. Then compute
the cumulative difference,

\[
\Delta_i(t) = \Delta_i(t - 1) + \bigl( x_{a,i}(t) - x_{e,i}(t) \bigr), \tag{11.12}
\]

of the total payoff in the past minus the expected payoff if the node always cooperates.
Each node chooses its strategy on the negative links by the following rule:

• if $\Delta_i(t) < 0$, node $i$ chooses to cooperate, i.e., $y_{ij} = 1$, $\forall j \in N_i$;
• if $\Delta_i(t) \ge 0$, $y_{ij} = 0$ if $j \in N_i$ and $x_{ij} < 0$.


Notice that at time 0, $\Delta_i(0) = 0$. That is to say, initially all nodes choose not to
cooperate on the negative links, since they are inherently selfish. There are two other
conditions that force non-cooperation strategies:


• nodes do not cooperate with neighbors that have been revoked;
• nodes do not cooperate with non-cooperative neighbors.


To summarize, as long as one of the aforementioned conditions is satisfied, nodes
choose not to cooperate.

Since we allow and encourage nodes to rectify, i.e., to change their strategies
from non-cooperation to cooperation, we define a temporal threshold $\tau$ in the trust
propagation rule. Instead of always keeping state 0 once the state is switched to 0, as
in Section 11.4.1, we allow the nullification of revocation (switch back to state 1)
under the condition that the revocation has been nullified for $\tau$ consecutive time
steps. $\tau$ also represents the penalty for being non-cooperative. $\tau$ needs to be large so
that the non-cooperative nodes would rather switch to cooperate than get penalized.
However, a large $\tau$ will also reduce the payoff.

The detailed algorithm is shown in Fig. 11.5.

Suppose the total payoff of node $i$, if every node cooperates, is $x_i = \sum_{j \in N_i} x_{ij}$.
We have the following


Theorem 3. $\forall i \in N$ and $x_i > 0$, there exists $\tau_0$, such that for a fixed $\tau > \tau_0$:


1. The iterated game converges to Nash equilibrium.
2. $\Delta_i(t)/t \to 0$ as $t \to \infty$.
3. $i$ cooperates with all its neighbors for $t$ large enough.


Proof. Nodes without negative links will always cooperate, thus $\Delta_i = 0$. Therefore,
we only consider nodes with negative links. First we prove that for $t$ large enough
$\Delta_i(t) < 0$. Define for node $i$ the absolute sums of positive payoffs and negative
payoffs as $x_i^{(p)}$ and $x_i^{(n)}$ respectively. Then

\[
x_i = x_i^{(p)} - x_i^{(n)}.
\]

Therefore the first payoff for node $i$ is $x_{a,i}(1) = x_i^{(p)} > 0$ and $\Delta_i(1) = x_i^{(n)}$. Define
$T_{\max}$ as the maximum propagation delay in the network. Then at $t = T_{\max}$ all its
neighbors revoke $i$ because at time $t = 1$, $i$ did not cooperate, and the payoff now is
$x_{a,i}(T_{\max}) = 0$ and $\Delta_i(T_{\max}) = \Delta_i(T_{\max} - 1) - x_i$. Node $i$ continues to get 0 payoff
until all neighboring nodes have used the penalty interval $\tau$. It is easy to show that if $\tau$
is set large enough, $i$ eventually gets negative $\Delta_i$.

If $i$ follows the strategy rules in Fig. 11.5, $i$ starts to cooperate with all neighbors.
The difference between the actual payoff and the expected payoff is 0 from then on.
Therefore $\Delta_i(t)/t \to 0$ as $t \to \infty$.

Assume node $i$ deviates to non-cooperation; then it will get a negative cumulative
payoff difference as discussed above. So node $i$ has no incentive to deviate from
cooperation. Therefore the game converges to its Nash equilibrium with all nodes
cooperating.

Consider node $i$, and the initial settings are as follows:

• all the trust states are set to $s_{ij} = 1$, $\forall j \in N_i$;
• the variable $\Delta_i(0) = 0$.

Node $i$ chooses strategies and updates variables in each time step for $t = 1, 2, \ldots$:

1. The strategy on the game with neighbor $j$ is set according to the following
   rule:
   • for negative links ($x_{ij} < 0$), choose non-cooperation strategy ($y_{ij} = 0$)
     if $\Delta_i(t - 1) \ge 0$;
   • if $s_{ij} = 0$, $y_{ij} = 0$;
   • for all neighbors, $y_{ij} = 0$ iff $y_{ji} = 0$ (cooperation is bilateral);
   • otherwise $y_{ij} = 1$.
2. For all $j \in N_i$, update the trust state $s_{ij}$ if one of the following three
   conditions is satisfied, otherwise keep the previous state:
   • if $i$ accepts a revocation on node $j$, $s_{ij} = 0$;
   • if the revocation has been nullified for more than $\tau$ consecutive steps, set
     $s_{ij} = 1$;
   • if $y_{ji} = 0$, set $s_{ij} = 0$.
3. Compute the actual payoff $x_{a,i}(t)$ and expected payoff $x_{e,i}(t)$, then get the
   cumulative difference $\Delta_i(t)$ by Eqn. (11.12).

Fig. 11.5. Algorithm for game evolution modeling trust revocation.
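
The role of the penalty interval in the argument above can be illustrated with a deliberately simplified single-node simulation. The Python sketch below is not the algorithm of Fig. 11.5 itself; it makes strong assumptions that are ours: only one node is tracked, all its neighbors are otherwise cooperative, the expected payoff equals $x_i$ at every step, a defection becomes known to all neighbors `t_max` steps after it starts, and the revocation then lasts exactly `tau` steps. The function and parameter names are hypothetical.

```python
def simulate_node(x_pos, x_neg, t_max, tau, horizon):
    """Track the cumulative difference of Eq. (11.12) for one node.

    x_pos, x_neg : absolute sums of the node's positive and negative link payoffs
    t_max        : steps until a defection is known to all neighbors (assumed)
    tau          : number of steps a revocation stays in force (assumed)
    Returns ([Delta(0), ..., Delta(horizon)], final_cooperation_flag).
    """
    x_total = x_pos - x_neg        # payoff when everyone cooperates
    delta = 0.0
    history = [delta]
    revoked_from = None            # first step of the current or pending revocation
    cooperating = False
    for t in range(1, horizon + 1):
        revoked = revoked_from is not None and revoked_from <= t < revoked_from + tau
        if revoked_from is not None and t >= revoked_from + tau:
            revoked_from = None    # revocation nullified after tau steps
        cooperating = delta < 0    # strategy rule on negative links
        if revoked:
            actual = 0.0           # neighbors refuse to cooperate with the node
        elif cooperating:
            actual = x_total
        else:
            actual = x_pos         # the node skips its negative links
            if revoked_from is None:
                revoked_from = t + t_max
        expected = x_total         # payoff had the node always cooperated
        delta += actual - expected # Eq. (11.12)
        history.append(delta)
    return history, cooperating


long_penalty, coop_long = simulate_node(3.0, 1.0, t_max=2, tau=3, horizon=8)
short_penalty, coop_short = simulate_node(3.0, 1.0, t_max=2, tau=1, horizon=8)
print(long_penalty, coop_long)
print(short_penalty, coop_short)
```

With the longer penalty the cumulative difference becomes negative and the node settles into cooperation; with the shorter one it returns to zero after each revocation and the node keeps defecting, which is the behavior the threshold $\tau_0$ in Theorem 3 rules out.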














We have also performed simulation experiments with our evolution algorithm.
In the simulations, we did not assume the condition that $\forall i$, $x_i > 0$; instead, the
percentage of negative links is the simulation parameter. Without
this condition, our iterated game with the trust scheme still achieves very good
performance. Fig. 11.6 shows that cooperation is highly promoted under the trust
mechanism. Fig. 11.7 compares the average payoffs of the algorithm with and
without strategy update, which explains why nodes converge to cooperation.


11.5 Conclusions and Future Directions 


In this chapter we investigated fundamental methods by which collaboration in
infrastructure-less wireless networks with mobile nodes can be induced, analyzed
and evaluated. We have also described a new framework within which
the problem of distributed trust establishment and maintenance in a mobile ad hoc
network (MANET) can be formulated and analyzed.

Fig. 11.6. Percentage of cooperating pairs vs. negative links.

We concentrated only on distributed methods that use local interactions. We first
developed and analyzed a cooperative game framework and demonstrated how
collaboration can be induced. We showed that negotiation between the mobile agents
is an important component for achieving collaboration within this framework. We
next developed a model for establishing, propagating and managing trust within a
MANET. We showed that such trust mechanisms can also establish collaboration,
even without negotiations between the mobile agents. Finally we investigated
the dynamics of both the games and trust propagation as a means for quantifying the
degree of collaboration achieved among the agents and the speed with which this
collaboration spreads through a large part of the network. In the research
reported here, we have drawn inspiration from analytical methods used in
statistical mechanics investigations of the Ising model and spin glasses. These analogies
include the existence and investigation of phenomena analogous to phase transitions.

Important current and future directions of our research program are the evaluation 
of the robustness of these mechanisms for collaboration in wireless networks, analy- 
sis of their reliability and identification of parameters (including topology types) that 
influence the dynamics and the qualities of the induced collaborative behavior.


Summary. Motivated by resource allocation problems in communication networks as well
as power systems, we consider the design of market mechanisms for such settings which are 
robust to gaming behavior by market participants. Recent results in this work are reviewed, 
including: (1) efficiency loss guarantees for a data rate allocation mechanism first proposed 
by Kelly, both when link capacities are fixed and when they are elastic; (2) characterization of 
mechanisms that minimize the efficiency loss, within a certain class of “simple” mechanisms; 
(3) extensions to general networks; and (4) mechanism design for supply function bidding in 
electric power systems. 


12.1 Introduction 


This chapter addresses a problem at the nexus of engineering, computer science, 
and economics: in large scale, decentralized systems, how can we efficiently allocate 
scarce resources among competing interests? On one hand, constraints are imposed 
on the system designer by the inherent architecture of any large scale system. These 
constraints are counterbalanced by the need to design mechanisms that efficiently 
allocate resources, even when the system is being used by participants who have 
only their own interests at stake. 

We consider two main classes of resource allocation problems. First, we consider 
a setting where a resource in scarce supply must be allocated among multiple com- 
peting consumers. Second, we discuss a setting where multiple producers compete 
to satisfy a fixed demand. The former model is motivated by applications to commu- 
nication networks, while the latter is motivated by electric power market design. 

What goals might we have for markets in such settings? We would of course like 
the equilibria of mechanisms designed for such settings to be “desirable;” a com- 
mon requirement is that equilibria should be Pareto efficient. In other contexts, we 
want the equilibria to satisfy a predetermined notion of fairness; or we may wish 
the resulting vector of monetary transfers to satisfy certain properties, such as profit 
maximization for the market operator. Beyond such constraints on the properties at
equilibrium, however, we are also concerned with the complexity of such mechanisms.
In particular, we may desire mechanisms which have relatively low information
overhead: the strategy spaces of the players should be “simple,” and the feedback
from the market to the players should be “simple” as well. Often, such complexity 
issues arise in a discussion of the dynamic behavior of market mechanisms, in trying 
to determine whether equilibria are actually achieved over time by players. 

In this chapter, we will focus on efficiency of mechanisms which maintain low 
complexity, appropriately defined. We focus on efficiency primarily as a first test of 
feasibility. Traditionally, economics has focused on selection of efficient mechanisms 
because mechanisms with inefficient equilibria are less likely to be useful in practice. 
Indeed, the classical theory of mechanism design is largely devoted to determining 
when fully efficient equilibria can be guaranteed (see, e.g., Chapter 23 of [23] for an 
overview). 

The landmark contribution of mechanism design is the Vickrey-Clarke-Groves
class of mechanisms, which guarantee efficient allocations at dominant strategy equi- 
libria [4, 11, 32]; unfortunately, implementing VCG mechanisms is generally a very 
complex proposition with many possible pitfalls [2, 27]. The task is further com- 
plicated by the fact that the VCG class of mechanisms are essentially the only class 
which guarantee fully efficient outcomes as dominant strategy equilibria [8]. Thus, to 
make progress, the notion of equilibrium must be weakened, and/or some efficiency 
must be lost. Previous results in the economics literature have considered weakening 
the notion of equilibrium; for example, Maskin has shown that if we only consider 
Nash equilibria, efficiency can be guaranteed if certain conditions are satisfied by 
players’ characteristics [24]. However, no guidance is available as to how to design 
such mechanisms with low complexity. 

In this chapter we weaken the requirement of full efficiency. The basic technique 
we consider is one of restricting the strategy spaces of the players (either buyers or 
sellers). With the proper choice of restriction, we can achieve two goals simultane- 
ously. First, by ensuring that strategy spaces are relatively simple, we can restrict 
attention to mechanisms with low complexity. Second, if strategies of players are 
restricted, we can reduce their opportunities to game the system; this will lead to 
provable bounds on efficiency loss at Nash equilibria. 

In the remainder of the chapter, we provide an overview of the progress made in 
our earlier work [15-17]. In Section 12.2, we consider a setting of multiple con- 
sumers and inelastic supply, motivated by rate allocation in communication net- 
works. For a single link of fixed capacity, we investigate a resource allocation mech- 
anism proposed by Kelly [18]. Network users choose bids, which denote the total 
amount they are willing to pay. A price is then chosen to clear the market; for the 
case of a single link, this allocation mechanism allocates fractions of the resource to 
the users in proportion to their bids. Kelly has previously shown that if users are price 
taking—that is, if they do not anticipate the effects of their actions on the market- 
clearing price—the resulting competitive equilibrium allocation is fully efficient. Our 
key result in this section is that when users are price anticipating, aggregate utility 
falls by no more than 25% relative to the maximum possible.

In Section 12.3, we consider the same basic mechanism as in Section 12.2, but
now consider a setting where supply is elastic; this is the model considered by Kelly 
et al. [19]. In this case the link is characterized by a cost depending on the total al- 
located rate, rather than a fixed capacity. Again, Kelly et al. have previously shown 
that if users are price taking, this mechanism maximizes aggregate surplus (i.e., ag- 
gregate utility minus cost). For this setting we establish that when users are price 
anticipating, aggregate surplus falls by no more than approximately 34% relative to 
the maximum possible. 

Sections 12.2 and 12.3 establish efficiency loss results for a specific market 
mechanism. In Section 12.4, we characterize the mechanism studied in Section 12.2 
as the “best” choice of mechanism under reasonable assumptions. Formally, we show 
that in a class of market-clearing mechanisms satisfying certain simple mathemati- 
cal assumptions and for which there exist fully efficient competitive equilibria, the 
mechanism of Section 12.2 uniquely minimizes efficiency loss when market partici- 
pants are price anticipating. These results justify the attention devoted to understand- 
ing the particular market mechanism studied in Sections 12.2 and 12.3; furthermore, 
they clearly delineate conditions which must be violated if we hope to achieve higher 
efficiency guarantees than those provided by the results of Sections 12.2 and 12.3. 

In Section 12.5, we summarize two further directions of research. First, in Sec- 
tion 12.5.1, we discuss the generalization of the models of Sections 12.2 and 12.3 
to networks with arbitrary topology. We consider games where users submit individ- 
ual bids to each link in the network. Such games are then proven to have the same 
efficiency loss guarantees as the single link games considered in Sections 12.2 and
12.3.

Next, in Section 12.5.2, motivated by power systems, we discuss a setting where 
multiple producers bid to satisfy an inelastic demand D. We consider a market 
mechanism where producers submit supply functions restricted to lie in a certain 
one-parameter family, and a market-clearing price is chosen to ensure that aggregate 
supply is equal to the inelastic demand. We establish that when producers are price 
anticipating, aggregate production cost rises by no more than a factor $1 + 1/(N - 2)$
relative to the minimum possible production cost, where $N > 2$ is the number of
firms competing. Finally, we conclude with some open issues in Section 12.6. 


12.2 Multiple Consumers, Inelastic Supply 


Suppose $R$ users share a communication link of capacity $C > 0$. Let $d_r$ denote
the rate allocated to user $r$. We assume that user $r$ receives a utility equal to $U_r(d_r)$
if the allocated rate is $d_r$; we assume that utility is measured in monetary units. We
make the following assumptions on the utility function.


Assumption 1. For each $r$, over the domain $d_r \ge 0$ the utility function $U_r(d_r)$ is
concave, strictly increasing, and continuous; and over the domain $d_r > 0$, $U_r(d_r)$
is continuously differentiable. Furthermore, the right directional derivative at 0,
denoted $U_r'(0)$, is finite.

Given complete knowledge and centralized control of the system, it would be
natural for the link manager to try to solve the following optimization problem [18]:

\begin{align}
\text{maximize} \quad & \sum_r U_r(d_r) \tag{12.1} \\
\text{subject to} \quad & \sum_r d_r \le C; \tag{12.2} \\
& d_r \ge 0, \quad r = 1, \ldots, R. \tag{12.3}
\end{align}


Note that the objective function of this problem is the aggregate utility. Since the
objective function is continuous and the feasible region is compact, an optimal solution
$d = (d_1, \ldots, d_R)$ exists. If the functions $U_r$ are strictly concave, then the optimal
solution is unique, since the feasible region is convex.

In general, the utility functions are not available to the link manager. As a result,
we consider the following pricing scheme for rate allocation. Each user $r$ submits
a payment (also called a bid) $w_r$ to the link manager; we assume $w_r \ge 0$.
Given the vector $w = (w_1, \ldots, w_R)$, the link manager chooses a rate allocation
$d = (d_1, \ldots, d_R)$. We assume the manager treats all users alike; in other words, the
link manager does not price discriminate. Each user is charged the same price $\mu > 0$,
leading to $d_r = w_r / \mu$. We further assume the manager always seeks to allocate the
entire link capacity $C$; in this case, we expect the price $\mu$ to satisfy:

\[
\sum_r \frac{w_r}{\mu} = C.
\]

The preceding equality can only be satisfied if $\sum_r w_r > 0$, in which case we have:

\[
\mu = \frac{\sum_r w_r}{C}. \tag{12.4}
\]

In other words, if the manager chooses to allocate the entire available rate at the link,
and does not price discriminate between users, then for every nonzero $w$ there is a
unique possible price $\mu > 0$, given by the previous equation.

We can interpret this mechanism as a market-clearing process by which a price
is set so that demand equals supply. To see this interpretation, note that when a user
submits a total payment $w_r$, it is as if the user has submitted a demand function
$D(p, w_r) = w_r / p$ for $p > 0$. The demand function describes the rate that the user
demands at any given price $p > 0$. The link manager then chooses a price $\mu$ so that
$\sum_r D(\mu, w_r) = C$, i.e., so that the aggregate demand equals the supply $C$. For the
specific form of demand functions we consider here, this leads to the expression for $\mu$
given in (12.4). User $r$ then receives a rate allocation given by $D(\mu, w_r)$, and makes
a payment $\mu D(\mu, w_r) = w_r$. This interpretation of the mechanism we consider here
will be further explored in Section 12.4, where we consider other market-clearing
mechanisms with the users submitting demand functions from a family parametrized
by a single scalar.

In the remainder of the section, we consider two different models for how users
might interact with this price mechanism. In Section 12.2.1, we consider a model
where users do not anticipate the effect of their bids on the price, and provide a
result, due to Kelly [18], on the existence of a competitive equilibrium. Furthermore,
this competitive equilibrium leads to an allocation which is an optimal solution to
(12.1)-(12.3). In Section 12.2.2, we change the model and assume users are price
anticipating, and provide a result (due to Hajek and Gopalakrishnan [12]) on the
existence and uniqueness of a Nash equilibrium. In Section 12.2.3, we then consider
the loss of efficiency at this Nash equilibrium, relative to the optimal solution to
(12.1)-(12.3).
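
The market-clearing rule (12.4) is simple enough to state as a short function. The following Python sketch is only an illustration: the function name `clear_market`, the example bids, and the convention of returning no price when all bids are zero are assumptions made here.

```python
def clear_market(bids, capacity):
    """Return (price, rates) for the proportional allocation mechanism.

    The price follows Eq. (12.4), mu = sum(bids) / capacity, and each user
    receives d_r = w_r / mu. If every bid is zero there is no market-clearing
    price; this sketch then returns (None, all-zero rates) by convention.
    """
    if any(w < 0 for w in bids):
        raise ValueError("bids must be nonnegative")
    total = sum(bids)
    if total <= 0:
        return None, [0.0] * len(bids)
    price = total / capacity
    rates = [w / price for w in bids]
    return price, rates


# Three hypothetical users sharing a link of capacity 8.
price, rates = clear_market([1.0, 2.0, 1.0], capacity=8.0)
print(price, rates)
```

The rates are proportional to the bids and add up to the capacity, and each user's payment, price times rate, equals the bid that user submitted.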


12.2.1 Price taking users and competitive equilibrium 


In this section, we consider a competitive equilibrium between the users and the
link manager [23], following the development of Kelly [18]. A central assumption
in the definition of competitive equilibrium is that each user does not anticipate the
effect of their payment $w_r$ on the price $\mu$, i.e., each user acts as a price taker. In this
case, given a price $\mu > 0$, user $r$ acts to maximize the following payoff function over
$w_r \ge 0$:

\[
P_r(w_r; \mu) = U_r\!\left( \frac{w_r}{\mu} \right) - w_r. \tag{12.5}
\]

The first term represents the utility to user $r$ of receiving a rate allocation equal to
$w_r / \mu$; the second term is the payment $w_r$ made to the manager. Observe that since
utility is measured in monetary units, the payoff is quasilinear in money [23].

We say that a pair $(w, \mu)$, with $w \ge 0$ and $\mu > 0$, is a competitive equilibrium if
users maximize their payoff as defined in (12.5), and the network “clears the market”
by setting the price $\mu$ according to (12.4):

\begin{align}
P_r(w_r; \mu) &\ge P_r(\bar{w}_r; \mu) \quad \text{for } \bar{w}_r \ge 0, \quad r = 1, \ldots, R; \tag{12.6} \\
\mu &= \frac{\sum_r w_r}{C}. \tag{12.7}
\end{align}


Kelly shows in [18] that when users are price takers, there exists a competitive
equilibrium, and the resulting allocation is an optimal solution to (12.1)-(12.3). This is
formalized in the following theorem, adapted from [18].


Theorem 1 (Kelly, [18]). Suppose that Assumption 1 holds. Then there exists a competitive
equilibrium, i.e., a vector $w = (w_1, \ldots, w_R) \ge 0$ and a scalar $\mu > 0$
satisfying (12.6)-(12.7).

In this case, the scalar $\mu$ is uniquely determined, and the vector $d = w / \mu$ is an
optimal solution to (12.1)-(12.3). If the functions $U_r$ are strictly concave, then $w$ is
uniquely determined as well.
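
For a concrete utility family the competitive equilibrium of Theorem 1 can be computed directly. The Python sketch below is a hedged illustration: it assumes utilities $U_r(d) = a_r \log(1 + d)$ (a hypothetical choice that satisfies Assumption 1), uses the price taker's first-order condition $U_r'(d_r) = \mu$ whenever $d_r > 0$ (our own derivation from (12.5), not a statement quoted from the text), and finds the price by bisection so that total demand equals the capacity.

```python
def competitive_equilibrium(a, capacity, iters=200):
    """Competitive equilibrium for utilities U_r(d) = a[r] * log(1 + d).

    Returns (mu, bids, rates) satisfying (12.6)-(12.7) up to numerical error.
    """
    def demand(mu):
        # Price taker's best rate: a_r / (1 + d) = mu, clipped at zero.
        return [max(0.0, a_r / mu - 1.0) for a_r in a]

    lo, hi = 1e-12, max(a)     # demand is huge at lo and zero at hi
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if sum(demand(mid)) > capacity:
            lo = mid           # price too low: demand exceeds supply
        else:
            hi = mid
    mu = 0.5 * (lo + hi)
    rates = demand(mu)
    bids = [mu * d for d in rates]
    return mu, bids, rates


# Two hypothetical users with weights 2 and 1 on a link of capacity 4.
mu, bids, rates = competitive_equilibrium([2.0, 1.0], capacity=4.0)
print(mu, bids, rates)
```

At the computed price both users have the same marginal utility, which is the optimality condition of problem (12.1)-(12.3), and the bids reproduce the price through (12.7).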


Theorem 1 shows that under the assumption that the users of the link behave as
price takers, there exists a bid vector $w$ where all users have optimally chosen their
bids $w_r$, with respect to the given price $\mu = \sum_r w_r / C$; and at this “equilibrium,”
aggregate utility is maximized. However, when the price taking assumption is violated,
the model changes into a game and the guarantee of Theorem 1 is no longer
valid.


12.2.2 Price anticipating users and Nash equilibrium


We now consider an alternative model where the users of a single link are price
anticipating, rather than price takers. The key difference is that while the payoff
function $P_r$ takes the price $\mu$ as a fixed parameter in (12.5), price anticipating users
will realize that $\mu$ is set according to (12.4), and adjust their payoff accordingly; this
makes the model a game between the $R$ players.

We use the notation $w_{-r}$ to denote the vector of all bids by users other than $r$;
i.e., $w_{-r} = (w_1, w_2, \ldots, w_{r-1}, w_{r+1}, \ldots, w_R)$. Given $w_{-r}$, each user $r$ chooses
$w_r$ to maximize:


$$Q_r(w_r; w_{-r}) = \begin{cases} U_r\!\left(\dfrac{w_r}{\sum_s w_s}\, C\right) - w_r, & \text{if } w_r > 0; \\ U_r(0), & \text{if } w_r = 0 \end{cases} \tag{12.8}$$


over all nonnegative $w_r$. The second condition is required so that the rate
allocation to user $r$ is zero when $w_r = 0$, even if all other users choose $w_{-r}$ so that
$\sum_{s \neq r} w_s = 0$. The payoff function $Q_r$ is similar to the payoff function $P_r$, except
that the user anticipates that the network will set the price $\mu$ according to (12.4). A
Nash equilibrium of the game defined by $(Q_1, \ldots, Q_R)$ is a vector $w \geq 0$ such that
for all $r$:

$$Q_r(w_r; w_{-r}) \geq Q_r(\bar{w}_r; w_{-r}), \quad \text{for all } \bar{w}_r \geq 0. \tag{12.9}$$


Hajek and Gopalakrishnan have shown that there exists a unique Nash equilib- 
rium when multiple users share the link, by showing that at a Nash equilibrium it 
is as if the users are solving another optimization problem of the same form as the 
problem (12.1)—(12.3), but with “modified” utility functions. This is formalized in 
the following theorem, adapted from [12]. 


Theorem 2 (Hajek and Gopalakrishnan, [12]). Suppose that $R > 1$, and that
Assumption 1 holds. Then there exists a unique Nash equilibrium $w \geq 0$ of the game
defined by $(Q_1, \ldots, Q_R)$, and it satisfies $\sum_r w_r > 0$.

In this case, the vector $d$ defined by:

$$d_r = \frac{w_r}{\sum_s w_s}\, C, \quad r = 1, \ldots, R, \tag{12.10}$$

is the unique optimal solution to the following optimization problem:

$$\text{maximize} \quad \sum_r \hat{U}_r(d_r) \tag{12.11}$$

$$\text{subject to} \quad \sum_r d_r \leq C; \tag{12.12}$$

$$d_r \geq 0, \quad r = 1, \ldots, R, \tag{12.13}$$

where

$$\hat{U}_r(d_r) = \left(1 - \frac{d_r}{C}\right) U_r(d_r) + \left(\frac{d_r}{C}\right) \left(\frac{1}{d_r} \int_0^{d_r} U_r(z)\, dz\right). \tag{12.14}$$


Theorem 2 shows that the unique Nash equilibrium of the game is characterized
as the solution to the optimization problem above. Other games have also profited
from such relationships, notably traffic routing games, in which Nash–Wardrop
equilibria can be found as solutions to a related global optimization problem.
Roughgarden and Tardos use this fact to their advantage in computing efficiency loss for
such games [28]; Correa, Schulz, and Stier Moses also use this relationship to
consider routing games in capacitated networks [5]. Finally, we note that for the game
presented here, several authors have derived results similar to Theorem 2 [7, 21, 22],
though not as general.
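
To make the characterization in Theorem 2 concrete, here is a minimal Python sketch. It again assumes the illustrative family $U_r(d) = a_r \log(1 + d)$ (an assumption, not part of [12]); the names `nash_equilibrium` and `payoff` are hypothetical. Differentiating (12.14) gives $\hat{U}_r'(d_r) = U_r'(d_r)(1 - d_r/C)$, so the optimality conditions of (12.11)–(12.13) read $U_r'(d_r)(1 - d_r/C) = \mu$ whenever $d_r > 0$, where the multiplier $\mu$ plays the role of the price $\sum_r w_r / C$. The sketch solves these conditions by bisection on $\mu$ and then checks on a grid that no user can improve the payoff (12.8) by deviating unilaterally.

```python
import math


def nash_equilibrium(a, C, tol=1e-12):
    """Nash bids via the optimality conditions of (12.11)-(12.13).

    Assumption (illustrative only): U_r(d) = a_r * log(1 + d), and R > 1.
    """
    def rate(ar, mu):
        # Solves a_r * (1 - d/C) / (1 + d) = mu for d, clipped at 0.
        return max((ar - mu) / (mu + ar / C), 0.0)

    lo, hi = 0.0, max(a)
    while hi - lo > tol:
        mu = 0.5 * (lo + hi)
        if sum(rate(ar, mu) for ar in a) > C:
            lo = mu
        else:
            hi = mu
    mu = 0.5 * (lo + hi)
    d = [rate(ar, mu) for ar in a]
    w = [mu * dr for dr in d]
    return w, mu, d


def payoff(a_r, w_r, w_others, C):
    """Q_r from (12.8) for the assumed utility; w_others is the sum of the other bids."""
    if w_r == 0:
        return 0.0  # U_r(0) = a_r * log(1) = 0
    return a_r * math.log(1.0 + w_r / (w_r + w_others) * C) - w_r


a, C = [1.0, 2.0, 4.0], 3.0
w, mu, d = nash_equilibrium(a, C)
for r, a_r in enumerate(a):
    others = sum(w) - w[r]
    best = payoff(a_r, w[r], others, C)
    # No bid on the grid [0, 10] should beat the equilibrium bid.
    assert all(payoff(a_r, 0.01 * k, others, C) <= best + 1e-9 for k in range(1001))
```

The grid check is only a sanity test of the first order conditions; it relies on the concavity of $Q_r$ in $w_r$.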


12.2.3 Efficiency loss 


We let $d^S$ denote an optimal solution to (12.1)–(12.3), and let $d^G$ denote the
unique optimal solution to (12.11)–(12.13). We now investigate the efficiency loss
of this system; that is, the utility loss caused by the price anticipating behavior of
the users. More precisely, we will compare the utility $\sum_r U_r(d_r^G)$ obtained when the
users fully evaluate the effect of their actions on the price, and the maximum possible
aggregate utility $\sum_r U_r(d_r^S)$. (We know, of course, that $\sum_r U_r(d_r^G) \leq \sum_r U_r(d_r^S)$,
by definition of $d^S$.) According to the following theorem, the worst case efficiency
loss is exactly 25%; the proof may be found in [17].


Theorem 3. Suppose that $R > 1$, and that Assumption 1 holds. Suppose also that
$U_r(0) \geq 0$ for all $r$. If $d^S$ is any optimal solution to (12.1)–(12.3), and $d^G$ is the
unique optimal solution to (12.11)–(12.13), then:

$$\sum_r U_r(d_r^G) \geq \frac{3}{4} \sum_r U_r(d_r^S).$$

Furthermore, this bound is tight: for every $\varepsilon > 0$, there exists a choice of $R$, and a
choice of (linear) utility functions $U_r$, $r = 1, \ldots, R$, such that:

$$\sum_r U_r(d_r^G) \leq \left(\frac{3}{4} + \varepsilon\right) \left(\sum_r U_r(d_r^S)\right).$$


We provide some comments on the method for proving a result such as Theorem
3. The first step is to show that the worst case efficiency loss occurs when the utility
functions belong to a certain finite-dimensional family; in the current context, it is the
family of linear utility functions. Identifying the worst case utility functions amounts
to minimizing an efficiency measure over all possible choices of the coefficients
of the linear utility functions. It turns out that this minimization can be cast as a
sequence of finite-dimensional nonlinear optimization problems (each problem in
the sequence corresponding to a different number $R$ of users), which can be studied
analytically. In the context of Theorem 3, the worst efficiency loss corresponds to a
link of capacity 1, where user 1 has utility $U_1(d_1) = d_1$, and all other users have
utility $U_r(d_r) \approx d_r/2$. As $R \to \infty$, at the Nash equilibrium of the game, user 1
receives a rate $d_1^G = 1/2$, while the remaining users uniformly split the rate $1 - d_1^G = 1/2$
among themselves, yielding an aggregate utility of 3/4.
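
The limiting example can be checked with exact arithmetic. The Python sketch below is an illustration under stated assumptions: it takes $C = 1$, $U_1(d) = d$, and $U_r(d) = d/2$ exactly for $r \geq 2$ (the text only requires approximately $d/2$), and the helper name `nash_ratio` is hypothetical. It uses the stationarity condition $U_r'(d_r)(1 - d_r/C) = \mu$ of problem (12.11)–(12.13), obtained by differentiating (12.14). With linear utilities of slope $\alpha_r$ this gives $d_r = 1 - \mu/\alpha_r$, and the constraint $\sum_r d_r = 1$ then yields $\mu = (R - 1)/(2R - 1)$.

```python
from fractions import Fraction


def nash_ratio(R):
    """Nash-to-optimal aggregate utility for C = 1, U_1(d) = d, U_r(d) = d/2 (r >= 2).

    The utility slopes are assumptions chosen to mimic the worst case example.
    """
    mu = Fraction(R - 1, 2 * R - 1)       # price solving sum_r d_r = 1
    d1 = 1 - mu                           # user 1: 1 * (1 - d_1) = mu
    d_other = 1 - 2 * mu                  # users r >= 2: (1/2) * (1 - d_r) = mu
    assert d1 + (R - 1) * d_other == 1    # capacity is fully used
    nash_utility = d1 + (R - 1) * d_other / 2
    optimal_utility = Fraction(1)         # optimum gives the whole link to user 1
    return nash_utility / optimal_utility


for R in (2, 10, 100, 1000):
    print(R, float(nash_ratio(R)))
```

For every finite $R$ the ratio stays above 3/4 and decreases toward it as $R$ grows, with $d_1 = R/(2R - 1)$ approaching 1/2.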

We note that a similar bound was observed by Roughgarden and Tardos for traffic 
routing games with affine link latency functions [28]. They found that the ratio of 
worst case Nash equilibrium cost to optimal cost was 4/3. However, it is questionable 
whether a relationship can be drawn between the two games; in particular, we note 
that while Theorem 3 holds even if the utility functions are nonlinear, Roughgarden 
and Tardos have shown that the efficiency loss due to selfish users in traffic routing 
may be arbitrarily high if link latency functions are nonlinear. 


12.3 Multiple Consumers, Elastic Supply 


In this section, we allow the supply of the scarce resource to be elastic, rather
than fixed as in the previous section. Rather than being characterized by a capacity,
we will characterize the resource through a cost function that gives the cost incurred
by the resource as a function of the flow through it. We continue to assume that $R$
users share a single communication link, and that user $r$ receives a utility $U_r(d_r)$
if the allocated rate is $d_r$. We let $f = \sum_r d_r$ denote the total rate allocated at the
link, and let $C(f)$ denote the cost incurred at the link when the total allocated rate
is $f \geq 0$. We will assume that both $U_r$ and $C$ are measured in the same monetary
units. A natural interpretation is that $U_r(d_r)$ is the monetary value to user $r$ of a rate
allocation $d_r$, and $C(f)$ is a monetary cost for congestion at the link when the total
allocated rate is $f$.

We continue to assume the utility functions $U_r$ satisfy Assumption 1. In addition,
we make the following assumption on the cost function $C$.


Assumption 2. There exists a continuous, convex, strictly increasing function $p(f)$
over $f \geq 0$ with $p(0) = 0$, such that for $f \geq 0$:

$$C(f) = \int_0^f p(z)\, dz.$$

Thus $C(f)$ is strictly convex and strictly increasing.


Given complete knowledge and centralized control of the system, it would be
natural for the link manager to try to solve the following optimization problem [18]:

$$\text{maximize} \quad \sum_r U_r(d_r) - C\!\left(\sum_r d_r\right) \tag{12.15}$$

$$\text{subject to} \quad d_r \geq 0, \quad r = 1, \ldots, R. \tag{12.16}$$

We refer to the objective function (12.15) as the aggregate surplus. This is the net
monetary benefit to the economy consisting of the users and the single link. Since the
objective function is continuous, and $U_r$ increases at most linearly, while $C$ increases
superlinearly, an optimal solution $d^S = (d_1^S, \ldots, d_R^S)$ exists; since the feasible
region is convex and $C$ is strictly convex, if the functions $U_r$ are strictly concave, then
the optimal solution is unique.

We consider the following pricing scheme for rate allocation, a natural analogue
of the mechanism presented in Section 12.2. Each user $r$ submits a payment (or
bid) of $w_r$ to the resource manager. Given the composite vector $w = (w_1, \ldots, w_R)$,
the resource manager chooses a rate allocation $d(w) = (d_1(w), \ldots, d_R(w))$. We
assume the manager treats all users alike; in other words, the network manager
does not price differentiate. Thus the network manager sets a single price $\mu(w)$; we
assume that $\mu(w) = 0$ if $w_r = 0$ for all $r$, and $\mu(w) > 0$ otherwise. All users are
then charged the same price $\mu(w)$, leading to:

$$d_r(w) = \begin{cases} 0, & \text{if } w_r = 0; \\ \dfrac{w_r}{\mu(w)}, & \text{if } w_r > 0. \end{cases}$$


Notice that, with this formulation, the rate allocated to user $r$ is similar to the rate
allocated to user $r$ in the model of Section 12.2. The key difference in this setting
is that the aggregate rate is not constrained to an inelastic supply; rather, associated
with the choice of price $\mu(w)$ is an aggregate rate function $f(w)$, defined by:

$$f(w) = \sum_r d_r(w) = \begin{cases} 0, & \text{if } \sum_r w_r = 0; \\ \dfrac{\sum_r w_r}{\mu(w)}, & \text{if } \sum_r w_r > 0. \end{cases} \tag{12.17}$$





Let us assume for now that given a price $\mu > 0$, user $r$ wishes to maximize the
following payoff function over $w_r \geq 0$:

$$P_r(w_r; \mu) = U_r\!\left(\frac{w_r}{\mu}\right) - w_r. \tag{12.18}$$

The first term represents the utility to user $r$ of receiving a rate allocation equal to
$w_r/\mu$; the second term is the payment $w_r$ made to the manager.

Notice that as formulated above, the payoff function $P_r$ assumes that user $r$ acts
as a price taker; that is, user $r$ does not anticipate the effect of his choice of $w_r$ on the
price $\mu$, and hence on his resulting rate allocation $d_r(w)$. Informally, we expect that
in such a situation the aggregate surplus will be maximized if the network manager
sets a price equal to marginal cost; that is, if the price function satisfies:

$$\mu(w) = p(f(w)). \tag{12.19}$$


The well-posedness of such a pricing mechanism is the subject of the following 
proposition. 


Proposition 1. Suppose Assumption 2 holds. Given any vector of bids $w \geq 0$, there
exists a unique pair $(\mu(w), f(w)) \geq 0$ satisfying (12.17) and (12.19), and in this
case $f(w)$ is the unique solution $f$ to:

$$\sum_r w_r = f\, p(f). \tag{12.20}$$

Furthermore, $f(\cdot)$ has the following properties: (1) $f(0) = 0$; (2) $f(w)$ is
continuous for $w \geq 0$; (3) $f(w)$ is a strictly increasing and strictly concave function of
$\sum_r w_r$; and (4) $f(w) \to \infty$ as $\sum_r w_r \to \infty$.


Observe that we can view (12.20) as a market-clearing process. Given the total
revenue $\sum_r w_r$ from the users, the link manager chooses an aggregate rate $f(w)$
so that the revenue is exactly equal to the aggregate charge $f(w)\, p(f(w))$. Due to
Assumption 2, this market-clearing aggregate rate is uniquely determined. Kelly et
al. present two algorithms in [19] which amount to dynamic processes of
market-clearing; as a result, a key motivation for the mechanism we study in this section is
that it represents the equilibrium behavior of the algorithms in [19].
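
The market-clearing step (12.20) is a one-dimensional root-finding problem, because $f\, p(f)$ is continuous and strictly increasing under Assumption 2. The following Python sketch shows one way to carry it out by bisection. It is not one of the algorithms of [19]; the function names `clearing_rate` and `allocate`, and the example price function $p(f) = f$, are assumptions made for illustration. The caller is expected to supply a price function `p` satisfying Assumption 2.

```python
def clearing_rate(w, p, tol=1e-12):
    """Aggregate rate f >= 0 with f * p(f) = sum(w), found by bisection.

    Assumes p satisfies Assumption 2 (continuous, strictly increasing, p(0) = 0).
    """
    total = sum(w)
    if total == 0:
        return 0.0
    hi = 1.0
    while hi * p(hi) < total:      # grow the bracket until the revenue is covered
        hi *= 2.0
    lo = 0.0
    while hi - lo > tol * max(1.0, hi):
        mid = 0.5 * (lo + hi)
        if mid * p(mid) < total:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def allocate(w, p):
    """Rates d_r = w_r / mu with mu = p(f(w)) as in (12.19); zero bids get rate 0."""
    f = clearing_rate(w, p)
    price = p(f)
    d = [wr / price if wr > 0 else 0.0 for wr in w]
    return d, f, price


# Illustrative price function p(f) = f, for which f(w) = sqrt(sum(w)).
d, f, price = allocate([0.25, 0.75, 0.0], p=lambda f: f)
print(d, f, price)
```

With $p(f) = f$ and total revenue 1, the clearing rate and price are both 1, and the individual rates sum to the aggregate rate as required by (12.17).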

For the remainder of this section, we will assume that $\mu(w)$ is set according to
the choice prescribed in Proposition 1, as follows.


Assumption 3. For all $w \geq 0$, the aggregate rate $f(w)$ is the unique solution to
(12.20): $\sum_r w_r = f(w)\, p(f(w))$. Furthermore, for each $r$, $d_r(w)$ is given by:

$$d_r(w) = \begin{cases} 0, & \text{if } w_r = 0; \\ \dfrac{w_r}{p(f(w))}, & \text{if } w_r > 0. \end{cases} \tag{12.21}$$

Note that we have $f(w) > 0$ and $p(f(w)) > 0$ if $\sum_r w_r > 0$, and hence $d_r$ is
always well defined.

In the remainder of this section, we consider two different models for how users 
might interact with this price mechanism. In Section 12.3.1, we consider a model 
where users do not anticipate the effect of their bids on the price, in which case there 
exists a competitive equilibrium. Furthermore, this competitive equilibrium leads to 
an allocation which is an optimal solution to (12.15)—(12.16). In Section 12.3.2, we 
change the model and assume users are price anticipating, in which case there exists 
a Nash equilibrium. Finally, Section 12.3.3 considers the loss of efficiency at Nash 
equilibria, relative to the optimal solution to (12.15)—(12.16). 


12.3.1 Price taking users and competitive equilibrium 


Kelly et al. show in [19] that when users are price takers, and the network sets the
price $\mu(w)$ according to (12.17) and (12.19), the resulting allocation is an optimal
solution to (12.15)–(12.16). This is formalized in the following theorem, adapted
from [19].


Theorem 4 (Kelly et al., [19]). Suppose Assumptions 1, 2, and 3 hold. Then there
exists a vector $w$ such that $\mu(w) > 0$, and:

$$P_r(w_r; \mu(w)) = \max_{\bar{w}_r \geq 0} P_r(\bar{w}_r; \mu(w)), \quad r = 1, \ldots, R. \tag{12.22}$$

For any such vector $w$, the vector $d(w) = w/\mu(w)$ is an optimal solution to
(12.15)–(12.16). If the functions $U_r$ are strictly concave, such a vector $w$ is unique
as well.


Theorem 4 shows that with an appropriate choice of price function (as determined
by (12.17) and (12.19)), and under the assumption that the users of the link behave as
price takers, there exists a bid vector $w$ where all users have optimally chosen their
bids $w_r$, with respect to the given price $\mu(w)$; and at this “equilibrium,” the aggregate
surplus is maximized. However, when the price taking assumption is violated, the
model changes into a game and the guarantee of Theorem 4 is no longer valid.
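
A small numerical illustration of Theorem 4 follows, as a Python sketch under assumptions that are not part of [19]: the utilities are $U_r(d) = a_r \log(1 + d)$ and the price function is $p(f) = f$, so that $C(f) = f^2/2$; the function name is hypothetical. A price taker facing price $\mu$ wants the rate $\max(a_r/\mu - 1, 0)$, and (12.19) requires $\mu = p(f)$, where $f$ is the total rate. The sketch therefore searches for the aggregate rate $f$ at which total demand at price $p(f)$ equals $f$.

```python
def elastic_competitive_equilibrium(a, tol=1e-12):
    """Price-taking equilibrium with elastic supply.

    Assumptions (illustrative only): U_r(d) = a_r * log(1 + d) and p(f) = f.
    """
    def demand(mu):
        # Rate a price taker wants at price mu: solves a_r / (1 + d) = mu, clipped at 0.
        return [max(ar / mu - 1.0, 0.0) for ar in a]

    # At f = max(a) the price is max(a) and demand is 0; as f -> 0 demand blows up.
    lo, hi = 0.0, max(a)
    while hi - lo > tol:
        f = 0.5 * (lo + hi)
        if sum(demand(f)) > f:     # with p(f) = f, the price at aggregate rate f is f
            lo = f
        else:
            hi = f
    f = 0.5 * (lo + hi)
    mu = f                         # marginal cost pricing (12.19) with p(f) = f
    d = demand(mu)
    w = [mu * dr for dr in d]      # bids w_r = mu * d_r
    return w, mu, d


w, mu, d = elastic_competitive_equilibrium([3.0, 5.0])
print(w, mu, d)
```

In this example the price is 2, the rates are 0.5 and 1.5, and each user's marginal utility $a_r/(1 + d_r)$ equals the marginal cost $p(f) = 2$, which is the first order condition for maximizing the aggregate surplus (12.15). The bids also satisfy (12.20), since $\sum_r w_r = 4 = f\, p(f)$.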


12.3.2 Price anticipating users and Nash equilibrium 


We now consider an alternative model where the users of a single link are price
anticipating, rather than price taking, and play a game to acquire a share of the link.
Throughout the remainder of this section as well as in Section 12.3.3, we will assume
that the link manager sets the price $\mu(w)$ according to the unique choice prescribed
by Proposition 1.

We adopt the notation $w_{-r}$ to denote the vector of all bids by users other than
$r$; i.e., $w_{-r} = (w_1, w_2, \ldots, w_{r-1}, w_{r+1}, \ldots, w_R)$. Then given $w_{-r}$, each user $r$
chooses $w_r \geq 0$ to maximize:

$$Q_r(w_r; w_{-r}) = U_r(d_r(w)) - w_r, \tag{12.23}$$

over nonnegative $w_r$. The payoff function $Q_r$ is similar to the payoff function $P_r$,
except that the user now anticipates that the network will set the price according to
Assumption 3, as captured by the allocated rate $d_r(w)$. A Nash equilibrium of the
game defined by $(Q_1, \ldots, Q_R)$ is a vector $w \geq 0$ such that for all $r$:

$$Q_r(w_r; w_{-r}) \geq Q_r(\bar{w}_r; w_{-r}), \quad \text{for all } \bar{w}_r \geq 0. \tag{12.24}$$

The proof of the following proposition can be found in [16].


Proposition 2. Suppose that Assumptions 1, 2, and 3 hold. Then there exists a Nash
equilibrium $w$ for the game defined by $(Q_1, \ldots, Q_R)$.
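
Proposition 2 can be illustrated on a small symmetric example; the Python sketch below is only a sanity check under assumptions chosen for convenience, not a general solver. It assumes $p(f) = f$, so that (12.20) gives $f(w) = \sqrt{\sum_r w_r}$ and $d_r(w) = w_r / \sqrt{\sum_s w_s}$, and two users with linear utility $U_r(d) = d$. Setting the derivative of $Q_r$ in (12.23) to zero at a symmetric bid profile gives $w_r = 9/32$ for both users, i.e., an aggregate rate of 3/4. The sketch confirms by grid search that $9/32$ is a best reply to $9/32$.

```python
import math


def rate(w_r, w_other):
    """d_r(w) under the assumed price function p(f) = f, where f(w) = sqrt(sum of bids)."""
    if w_r == 0:
        return 0.0
    return w_r / math.sqrt(w_r + w_other)


def payoff(w_r, w_other):
    """Q_r from (12.23) with the assumed linear utility U_r(d) = d."""
    return rate(w_r, w_other) - w_r


w_star = 9.0 / 32.0
grid = [k / 3200.0 for k in range(3201)]          # candidate bids in [0, 1]
best_reply = max(grid, key=lambda w: payoff(w, w_star))

f_nash = math.sqrt(2 * w_star)                     # aggregate rate at the Nash equilibrium
surplus_nash = f_nash - f_nash ** 2 / 2            # total utility minus C(f) = f^2 / 2
surplus_opt = 1.0 - 1.0 ** 2 / 2                   # optimum: marginal utility 1 = p(f), so f = 1
print(best_reply, f_nash, surplus_nash / surplus_opt)
```

In this example the Nash equilibrium carries less traffic than the surplus-maximizing allocation (aggregate rate 3/4 instead of 1), and the aggregate surplus at the Nash equilibrium is 15/16 of the optimum.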


12.3.3 Efficiency loss 


We let $d^S$ denote an optimal solution to (12.15)–(12.16), and let $w$ denote
any Nash equilibrium of the game defined by $(Q_1, \ldots, Q_R)$. We now investigate
the associated efficiency loss. In particular, we compare the aggregate surplus
$\sum_r U_r(d_r(w)) - C(\sum_r d_r(w))$ obtained when the users fully evaluate the effect of
their actions on the price, and the aggregate surplus $\sum_r U_r(d_r^S) - C(\sum_r d_r^S)$
obtained by choosing an allocation which maximizes aggregate surplus. According to
the following theorem, the efficiency loss is no more than approximately 34%, and
this bound is essentially tight; the proof can be found in [16].


Theorem 5. Suppose that Assumptions 1, 2, and 3 hold. Suppose also that
$U_r(0) \geq 0$ for all $r$. If $d^S$ is any optimal solution to (12.15)–(12.16), and $w$ is any Nash
equilibrium of the game defined by $(Q_1, \ldots, Q_R)$, then:

$$\sum_r U_r(d_r(w)) - C\!\left(\sum_r d_r(w)\right) \geq \left(4\sqrt{2} - 5\right) \left(\sum_r U_r(d_r^S) - C\!\left(\sum_r d_r^S\right)\right). \tag{12.25}$$

In other words, there is no more than approximately a 34% efficiency loss when users
are price anticipating.

Furthermore, this bound is tight: for every $\delta > 0$, there exists a choice of $R$, a
choice of (linear) utility functions $U_r$, $r = 1, \ldots, R$, and a (piecewise linear) price
function $p$ such that a Nash equilibrium $w$ exists with:

$$\sum_r U_r(d_r(w)) - C\!\left(\sum_r d_r(w)\right) \leq \left(4\sqrt{2} - 5 + \delta\right) \left(\sum_r U_r(d_r^S) - C\!\left(\sum_r d_r^S\right)\right). \tag{12.26}$$


Let us remark here that, according to the proof of Theorem 5, the worst possible
efficiency loss is achieved along a sequence of games where:


1. The price function $p$ has the following form, with $b \to \infty$:

$$p(f) = \begin{cases} (2 - \sqrt{2})\, f, & \text{if } 0 \leq f \leq 1; \\ 2 - \sqrt{2} + b(f - 1), & \text{if } f \geq 1. \end{cases}$$

2. The number of users becomes large ($R \to \infty$); and
3. User 1 has linear utility with $U_1(d_1) = d_1$, and all other users $r$ have linear
utility with $U_r(d_r) = \alpha_r d_r$, where $\alpha_r \approx p(1) = 2 - \sqrt{2}$.


(Note that formally, we must take care that the limits of $R \to \infty$ and $b \to \infty$ are
taken in the correct order; in particular, in the proof we first have $R \to \infty$, and then
$b \to \infty$.) In this limit, we find that at the Nash equilibrium the aggregate allocated
rate is 1, and the Nash equilibrium aggregate surplus converges to $4\sqrt{2} - 5$.


12.4 A Characterization Theorem 


In this section we revisit the resource allocation problem of Section 12.2, and 
address the following question: can we identify a mechanism that minimizes the effi- 
ciency loss, in the presence of price anticipating users, within a class of mechanisms 
with certain desirable properties? 

Formally, we consider a collection of users bidding to receive a share of a finite,
infinitely divisible resource of capacity $C$. Each user has a utility function
$U : \mathbb{R}^+ \to \mathbb{R}^+$ (where $\mathbb{R}^+ = [0, \infty)$) that satisfies Assumption 1. More specifically, $U$ belongs
to the set $\mathcal{U}$ of utility functions defined by

$$\mathcal{U} = \{ U : \mathbb{R}^+ \to \mathbb{R}^+ \mid U \text{ is continuous, strictly increasing, concave on } [0, \infty), \text{ and continuously differentiable on } [0, \infty), \text{ with } U'(0) < \infty \}.$$

We let $R$ denote the number of users, and let $\mathbf{U} = (U_1, \ldots, U_R)$ denote the vector
of utility functions, where $U_r$ is the utility function of user $r$. We call a pair $(R, \mathbf{U})$,
where $R > 1$ and $\mathbf{U} \in \mathcal{U}^R$, a utility system; our goal will be to design a resource
allocation mechanism with attractive efficiency guarantees for all utility systems.

We assume once more that utility is measured in monetary units; thus, if user $r$
receives a rate allocation $d_r$, but must pay $w_r$, his net payoff is:

$$U_r(d_r) - w_r.$$


Given a utility system $\mathbf{U} \in \mathcal{U}^R$, the social objective is to maximize aggregate
utility, as defined in the problem (12.1)–(12.3); we repeat that problem here, and refer
to it as the problem SYSTEM($C$, $R$, $\mathbf{U}$), to emphasize that the problem is specified
by $C$, $R$, and the vector of utility functions $\mathbf{U}$.

$$\text{maximize} \quad \sum_{r=1}^R U_r(d_r) \tag{12.27}$$

$$\text{subject to} \quad \sum_{r=1}^R d_r \leq C; \tag{12.28}$$

$$d \geq 0. \tag{12.29}$$

We will say that $d$ solves SYSTEM($C$, $R$, $\mathbf{U}$) if $d$ is an optimal solution to
(12.27)–(12.29), given the utility system $(R, \mathbf{U})$.

In general, the utility system $(R, \mathbf{U})$ is unknown to the mechanism designer,
so a mechanism must be designed to elicit information from the users. We will
focus on mechanisms in which each user $i$ submits a demand function, within a
one-parameter family of admissible demand functions. In particular, each user has a
one-dimensional strategic variable, denoted by $\theta_i$.


Definition 1. Given $C > 0$, a smooth market-clearing mechanism for $C$ is a
differentiable function $D : (0, \infty) \times [0, \infty) \to \mathbb{R}^+$ such that for all $R$, and for all nonzero
$\theta \in (\mathbb{R}^+)^R$, there exists a unique solution $p > 0$ to the following equation:

$$\sum_{r=1}^R D(p, \theta_r) = C.$$

We let $p_D(\theta)$ denote this solution, and refer to it as the market-clearing price.


Note that the market-clearing price is undefined if $\theta = 0$. As we will see below,
when we formulate a game between consumers for a given mechanism $D$, we will
assume that the payoff to all players is $-\infty$ if the composite strategy vector is $\theta = 0$.
Note that this is slightly different from the definition in Section 12.2, where the
payoff to a player who submits $\theta = 0$ is set to zero. We will discuss this distinction
further later; we simply note for the moment that it does not affect the results of this
section.

Our definition of smooth market-clearing mechanisms generalizes the mechanism
discussed in Section 12.2. We recall that in that development, each user submits
a demand function of the form $D(p, \theta) = \theta/p$, and the link manager chooses a
price $p_D(\theta)$ to ensure that $\sum_{r=1}^R D(p_D(\theta), \theta_r) = C$. Thus, for this mechanism, we have
$p_D(\theta) = \sum_{r=1}^R \theta_r / C$ if $\theta \neq 0$. Another related example is provided by
$D(p, \theta) = \theta/\sqrt{p}$; in this case it is straightforward to verify that
$p_D(\theta) = \left(\sum_{r=1}^R \theta_r / C\right)^2$, for $\theta \neq 0$.
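
The two closed-form prices above can be checked numerically. The Python sketch below finds the root of $\sum_r D(p, \theta_r) = C$ by bisection on a logarithmic scale. It relies on an extra assumption that Definition 1 does not impose, namely that total demand is decreasing in $p$ (true for both examples); the function name `clearing_price`, the search bracket, and the sample values are likewise hypothetical.

```python
import math


def clearing_price(D, theta, C, lo=1e-12, hi=1e12, iters=200):
    """Root p of sum_r D(p, theta_r) = C.

    Assumes total demand is decreasing in p and that the root lies in [lo, hi].
    """
    if not any(t > 0 for t in theta):
        raise ValueError("market-clearing price is undefined for theta = 0")
    for _ in range(iters):
        mid = math.sqrt(lo * hi)                 # bisect on a logarithmic scale
        if sum(D(mid, t) for t in theta) > C:
            lo = mid                             # excess demand: the price must rise
        else:
            hi = mid
    return math.sqrt(lo * hi)


theta, C = [1.0, 2.0, 3.0], 4.0
p_kelly = clearing_price(lambda p, t: t / p, theta, C)
p_sqrt = clearing_price(lambda p, t: t / math.sqrt(p), theta, C)
print(p_kelly, sum(theta) / C)           # both approximately 1.5
print(p_sqrt, (sum(theta) / C) ** 2)     # both approximately 2.25
```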

We will further restrict attention to a particular class of mechanisms denoted $\mathcal{D}$,
which we define as follows.


Definition 2. The class $\mathcal{D}$ consists of all functions $D(p, \theta)$ such that the following
conditions are satisfied:


1. For all $C > 0$, $D$ is a smooth market-clearing mechanism for $C$ (cf. Definition
1).

2. For all $C > 0$, and for all $U_r \in \mathcal{U}$, the payoff of a price anticipating user is
concave; that is, for all $R$, and for all $\theta_{-r} \in (\mathbb{R}^+)^{R-1}$, the function:

$$U_r(D(p_D(\theta), \theta_r)) - p_D(\theta)\, D(p_D(\theta), \theta_r)$$

is concave in $\theta_r > 0$ if $\theta_{-r} = 0$, and concave in $\theta_r \geq 0$ if $\theta_{-r} \neq 0$.
3. The demand functions are nonnegative; i.e., for all $p > 0$ and $\theta \geq 0$,
$D(p, \theta) \geq 0$.


The first condition requires a mechanism in $\mathcal{D}$ to be a smooth market-clearing
mechanism for any $C > 0$; in particular, the market-clearing price $p_D(\theta)$ must be uniquely
defined for any $C > 0$. (Note that in the notation we suppress the dependence of
the market-clearing price $p_D(\theta)$ on the capacity $C$.) The second condition allows us
to characterize Nash equilibria in terms of only first order conditions; indeed, some
such assumption needs to be in place in order to guarantee existence of pure strategy
Nash equilibria [26]. Finally, the third condition is a normalization condition, which
ensures that a user is never required to supply some quantity of the resource (which
would be the case if we allowed $D(p, \theta) < 0$).

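In order to state the main result of this section, we must define competitive
equilibrium and Nash equilibrium. Given a utility system $(R, \mathbf{U})$, a capacity $C > 0$,
and a smooth market-clearing mechanism $D \in \mathcal{D}$, we say that a nonzero vector
$\theta \in (\mathbb{R}^+)^R$ is a competitive equilibrium if $\mu = p_D(\theta)$ satisfies:

$$\theta_r \in \arg\max_{\bar{\theta}_r \geq 0} \left[ U_r(D(\mu, \bar{\theta}_r)) - \mu D(\mu, \bar{\theta}_r) \right], \quad \forall r. \tag{12.30}$$

Similarly, we say that a nonzero vector $\theta \in (\mathbb{R}^+)^R$ is a Nash equilibrium if:

$$\theta_r \in \arg\max_{\bar{\theta}_r \geq 0} Q_r(\bar{\theta}_r; \theta_{-r}), \quad \forall r, \tag{12.31}$$

where

$$Q_r(\theta_r; \theta_{-r}) = \begin{cases} U_r(D(p_D(\theta), \theta_r)) - p_D(\theta)\, D(p_D(\theta), \theta_r), & \text{if } \theta \neq 0; \\ -\infty, & \text{if } \theta = 0. \end{cases} \tag{12.32}$$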


Notice that the payoff is $-\infty$ if the composite strategy vector is $\theta = 0$, since in this
case no market-clearing price exists.

Our interest is in the worst-case ratio of aggregate utility at any Nash equilibrium
to the optimal value of SYSTEM($C$, $R$, $\mathbf{U}$) (termed the “price of anarchy” by
Papadimitriou [25]). Formally, for $D \in \mathcal{D}$ and a capacity $C > 0$ we define a constant
$\rho(C, D)$ as follows:

$$\rho(C, D) = \inf \left\{ \frac{\sum_{r=1}^R U_r(D(p_D(\theta), \theta_r))}{\sum_{r=1}^R U_r(d_r)} \;\middle|\; R > 1,\ \mathbf{U} \in \mathcal{U}^R,\ d \text{ solves SYSTEM}(C, R, \mathbf{U}) \text{ and } \theta \text{ is a Nash equilibrium} \right\}.$$

Note that since all $U \in \mathcal{U}$ are strictly increasing and nonnegative, and $C > 0$, the
aggregate utility $\sum_{r=1}^R U_r(d_r)$ is strictly positive for any utility system $(R, \mathbf{U})$ and
any optimal solution $d$ to SYSTEM($C$, $R$, $\mathbf{U}$). However, Nash equilibria may not
exist for some utility systems $(R, \mathbf{U})$; in this case we set $\rho(C, D) = -\infty$.

The following theorem shows that among smooth market-clearing mechanisms 
for which there always exists a fully efficient competitive equilibrium, the mecha- 
nism proposed in Section 12.2 minimizes efficiency loss when users are price antic- 
ipating. The proof can be found in Chapter 5 of [15]. 


Theorem 6. Let $D \in \mathcal{D}$ be a smooth market-clearing mechanism such that for all
capacities $C > 0$ and utility systems $(R, \mathbf{U})$, there exists a competitive equilibrium
$\theta$ such that $(D(p_D(\theta), \theta_r), r = 1, \ldots, R)$ solves SYSTEM($C$, $R$, $\mathbf{U}$). Then for any
capacity $C$ and utility system $(R, \mathbf{U})$, there exists a unique Nash equilibrium.
Furthermore, $\rho(C, D) \leq 3/4$ for all $C > 0$ and all $D \in \mathcal{D}$, and this bound is met with
equality if and only if $D(p, \theta) = \Delta\theta/p$ for some $\Delta > 0$.


Theorem 6 suggests that the best efficiency guarantee we can hope to achieve 
is 75%, if we are restricted to market-clearing mechanisms with scalar strategy 
spaces. A key restriction in the mechanisms we consider is that a single price is 
chosen to clear the market. If the market designer is granted the latitude to price 
discriminate (i.e., to charge a different price to each user), better efficiency
guarantees are possible. The most famous mechanisms which ensure such a guarantee
are the Vickrey–Clarke–Groves class of mechanisms, for which fully efficient
dominant strategy equilibria exist [4, 11, 32]. More recently, in a networking context,
Sanghavi and Hajek [30] have shown that if users choose their payments (as in the
Kelly mechanism), but the link manager is allowed to choose the allocation to users
as an arbitrary function of the payments, it is possible to ensure no worse than a
13% efficiency loss. Furthermore, Yang and Hajek [34] have shown that if a
mechanism allocates resources in proportion to the users’ strategies (i.e., user $r$ receives
a fraction $\theta_r / \sum_{s=1}^R \theta_s$ of the resource), then by using differentiated pricing, it is
possible to guarantee arbitrarily small efficiency loss at the Nash equilibrium. The
mechanisms proposed by both Sanghavi and Hajek [30] as well as Yang and Hajek 
[34] require price discrimination, since the ratio of payment to allocation is not nec- 
essarily identical for all users (as must be the case in the market-clearing mechanisms 
studied here). 


12.5 Further Directions 


In addition to the results outlined above, several additional threads are included 
in this body of research. In this section, we describe two extensions: (1) resource 
allocation in general networks; and (2) a setting of multiple producers competing to 
satisfy an inelastic demand. 


12.5.1 General networks 


The models presented in Sections 12.2 and 12.3 only consider resource allocation 
for a single link. We now consider extensions to the network case, following [16] and 
[17]. We consider networks consisting of a set of links; each user has a set of paths 
available through the network to send traffic, and each path uses a subset of the links. 
In a setting of inelastic supply, each link $j$ is characterized by a fixed capacity $C_j$.
In a setting of elastic supply, each link $j$ is characterized by a cost function $C_j(\cdot)$.
We continue to assume that each user $r$ receives a utility $U_r(d_r)$ from a total rate
allocation $d_r$; however, note that in a network context $d_r$ is the total rate delivered to
user $r$ across all paths available to user $r$ through the network.

We extend the single link market mechanisms to multiple links by treating each 
link as a separate market. Thus we consider a game where each user requests service 
from multiple links by submitting an individual bid to each link. Links then allocate 
rates using the same scheme as in the single link model, and each user sends the 
maximum rate possible, given the vector of rates allocated from links in the network. 
Although this definition of the game is natural, we demonstrate that Nash equilibria 
may not exist in the setting of inelastic supply, due to a discontinuity in the payoff 
functions of individual players. (This problem also arises in the single link setting, 
but is irrelevant there as long as at least two players share the link.) To address the 
discontinuity, we extend the strategy space by allowing each user to request a nonzero 
rate without submitting a positive bid to a link, if the total payment made by other 
users at that link is zero; this extension is sufficient to guarantee existence of a Nash 
equilibrium. In the setting of elastic supply, Nash equilibria are always guaranteed 
to exist, without having to extend the strategy space. Finally, we show that in this
network setting, if link capacities are inelastic then the total utility achieved at any
Nash equilibrium of the game is no less than 3/4 of the maximum possible aggregate
utility; and if link supplies are elastic then the aggregate surplus achieved at any Nash
equilibrium of the game is no less than a factor $4\sqrt{2} - 5$ of the maximal aggregate
surplus. These results extend the efficiency loss results from the single link setting to
general networks.

The mechanisms we have studied require each user to submit a separate bid for 
each link that the user may use. An alternative mechanism had been proposed ear- 
lier by Kelly [18] whereby a user submits a single total payment, and the network 
determines both the rate allocations, as well as the divisions of the users’ total pay- 
ments among the links; in the single link case, this scheme reduces to that studied 
in Section 12.2. But Hajek and Yang [13] have shown that Kelly’s mechanism can 
result in Nash equilibria in which the aggregate utility is an arbitrarily small fraction 
of the optimal aggregate utility. It remains an open problem whether there exists a 
network resource allocation mechanism in which each user submits a single number, 
representing total payment, and which has some nontrivial efficiency guarantees. 


12.5.2 Multiple producers, inelastic demand 


The models presented thus far consider consumers competing for resources in 
scarce supply. Motivated by current problems in market design for electric power 
systems, we consider a model where multiple producers compete to satisfy an in- 
elastic demand. Demand for electricity, particularly in the short run, is characterized 
by low elasticity with respect to price, i.e., changes in price do not lead to signif- 
icant changes in the level of demand; see, e.g., [31], Section 1-7.3. A basic model 
for electricity market operation involves supply function bidding [20]: each genera- 
tor submits a supply function expressing their willingness to produce electricity as 
a function of the market clearing price. A single price is then chosen to ensure that 
supply matches the inelastic demand. 

Most previous work on supply function bidding has focused almost entirely on 
using the supply function equilibrium (SFE) framework of Klemperer and Meyer 
[20] for its predictive power. In such models, generators can submit nearly arbitrary 
supply functions; the Nash equilibria of the resulting game are used to give insight 
into expected behavior in current markets. In other words, by solving the SFE model 
for an appropriate set of assumptions, most previous work hopes to lend insight into 
the operation of power markets which require generators to submit complete supply 
schedules as bids [1, 6, 9, 10, 29, 33]. But because there may be a multiplicity of 
equilibria, an explicit understanding of efficiency losses in these games has not been 
developed. Papers such as the work of Rudkevich et al. [29] do suggest, however, that 
in the presence of inelastic demand, price anticipating behavior can lead to significant 
deviations from perfectly efficient allocations. 

For this reason we take a different approach (see Chapter 4 of [15]). We consider 
restrictions on the supply functions which can be chosen by the generators, and aim 
to design these restrictions so that nearly efficient allocations are achieved even if 
firms are price anticipating. Formally, we assume that each firm $n$ has a convex cost
function $C_n$, as a function of the electricity generated. An efficient production vector
minimizes the aggregate cost $\sum_n C_n(s_n)$, subject to the constraint that the total
produced electricity $\sum_n s_n$ must equal the demand $D$. We then consider the following
market. Each firm submits a supply function of the form $S(p, w) = D - w/p$, where
$D$ is the fixed (exogenous) demand and $w$ is a nonnegative scalar chosen by the firm.
The market then chooses a price so that aggregate supply is equal to demand.
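
For the supply functions $S(p, w) = D - w/p$, the market-clearing price can be written in closed form: with $N$ firms, $\sum_n (D - w_n/p) = D$ gives $p = \sum_n w_n / ((N - 1) D)$, provided $N > 1$ and at least one $w_n$ is positive. The short Python sketch below evaluates this formula; the function name `clear_supply_market` and the sample numbers are hypothetical, and the closed form is a direct rearrangement of the clearing condition rather than a procedure quoted from [15].

```python
def clear_supply_market(w, D):
    """Clearing price and production levels for S(p, w_n) = D - w_n / p.

    Assumes N > 1 firms, demand D > 0, and at least one positive w_n.
    """
    N, total = len(w), sum(w)
    if N < 2 or total <= 0 or D <= 0:
        raise ValueError("need N > 1, D > 0 and at least one positive w_n")
    price = total / ((N - 1) * D)            # from N * D - total / price = D
    supply = [D - wn / price for wn in w]    # firm n produces S(price, w_n)
    return price, supply


price, supply = clear_supply_market([1.0, 2.0, 3.0], D=2.0)
print(price, supply)
```

In this example the price is 1.5 and the production levels sum to the demand $D = 2$. Note that a larger $w_n$ corresponds to a smaller production level for firm $n$.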

If we assume that firms are price taking, it is possible to show that there exists 
a competitive equilibrium; furthermore, at this competitive equilibrium the resulting 
allocation minimizes aggregate production cost. If we assume instead that firms are 
price anticipating, we can establish existence of a Nash equilibrium and uniqueness 
of the resulting production vector, as long as more than two firms compete. Next, we 
consider the aggregate production cost at a Nash equilibrium relative to the minimal 
possible aggregate production cost. As long as more than two firms are competing, 
we show that the ratio of Nash equilibrium production cost to the minimal production
cost is no worse than $1 + 1/(N - 2)$, where $N$ is the number of firms in the
market. Furthermore, we demonstrate that this efficiency loss result carries over even 
to a setting where demand is inelastic but stochastically determined, by showing that 
in such an instance it is as if firms play a game with deterministic demand but differ- 
ent cost functions. Finally, a characterization theorem, similar to the one in Section 
12.4, is also available, indicating that the mechanism under study has the best pos- 
sible efficiency guarantees, within a class of mechanisms in which the generators 
are restricted to submitting a supply function chosen from within a restricted, one- 
parameter family. 
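
To get a feel for how quickly the guarantee tightens as competition increases, the bound 1 + 1/(N − 2) can be tabulated directly. This is only a numerical illustration of the stated bound; the helper name `worst_case_cost_ratio` is hypothetical.

```python
def worst_case_cost_ratio(num_firms):
    """Upper bound 1 + 1/(N - 2) on Nash cost relative to minimal cost."""
    if num_firms <= 2:
        raise ValueError("the bound is stated only for more than two firms")
    return 1.0 + 1.0 / (num_firms - 2)


for n in (3, 4, 12, 102):
    print(n, worst_case_cost_ratio(n))
```

With three firms the bound allows the Nash equilibrium cost to be twice the minimum, while with twelve firms the allowed excess is ten percent.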

These results, which have been established in [15], suggest that market power can 
be controlled, and efficient allocations guaranteed, by restricting the supply functions 
available to generators in electricity markets. Restricted families of supply functions 
have also been considered elsewhere in the literature, e.g., in [3]. However, these 
models are typically used as approximations to unconstrained supply function bid- 
ding, and thus the resulting efficiency loss has not been studied. Still, this work leaves 
many open questions; in particular, the dynamics of power systems, together with
their complex network structure, have not been captured in the models developed (in
contrast to the telecommunications models previously discussed). Furthermore, the 
work described here depends on convexity assumptions on the cost functions of the 
producers, and such assumptions may generally not hold in electricity markets [14]. 
Finally, away from a Nash equilibrium, e.g., if some generators do not act rationally, 
the remaining generators may have to produce electricity at highly undesirable or 
even impossible levels. Addressing these types of questions is the subject of current 
research. 


12.6 Open Issues 


We have discussed the efficiency properties of Nash equilibria associated with 
certain resource allocation mechanisms. For the case where there is a single available 
resource (respectively, a single demand to be satisfied), the mechanisms involve the 
submission of a demand (respectively, a supply) function, which can be specified 
in terms of a single parameter, followed by market-clearing. In each case, we have 
provided a tight bound on the worst case efficiency loss. It remains to understand 
the worst case efficiency loss when mechanisms belonging to broader classes are 
considered. For example, in the context of Section 12.2, what efficiency guarantees
are possible if users can choose a demand function from within a two-parameter 
family of demand curves? 

Another research direction relates to the study of natural adjustment dynamics in 
the context of various mechanisms. Indeed, a desirable mechanism should not only 
have efficiency guarantees for the resulting Nash equilibria. It should also allow for 
simple adjustment algorithms whereby the different players can converge, in a stable 
manner, to such a Nash equilibrium. 

Summary. We discuss how decentralized network resource allocation problems fit within the
context of mechanism design (realization theory and implementation theory), and how mech- 
anism design can provide useful insight into the nature of decentralized network resource 
allocation problems. The discussion is guided by the unicast problem with routing and Qual- 
ity of Service (QoS) requirements, and the multi-rate multicast service provisioning problem 
in networks. For these problems we present decentralized resource allocation mechanisms that 
achieve the solution of the corresponding centralized resource allocation problem and are in- 
formationally efficient. We show how the aforementioned mechanisms can be embedded into 
the general framework of realization theory, and indicate how realization theory can be used 
to establish the mechanisms’ informational efficiency in certain instances. We also present a 
conjecture related to implementation in Nash equilibria of the optimal centralized solution of 
the unicast service provisioning problem. 


13.1 Introduction: Motivation and Challenges 


Today’s fast paced world requires a vast amount of information exchange in or- 
der to operate efficiently. With the various technological advances, the number of
types of services being offered (e.g. telephone connections, live audio broadcasting,
live video broadcasting, library database access, e-mail, world wide web) is
constantly increasing. Each type of service imposes different Quality of Service (QoS)
requirements (e.g. delay, percentage of data packet loss, jitter) on the delivery meth- 
ods. To address these needs extensive communication networks were developed in 
the past century. Many of these networks (such as telephone networks) were initially 
designed for the delivery of certain types of information and were later adapted to 
accommodate new information exchange needs. 

Most of today’s networks, called integrated services networks, support the deliv- 
ery of a variety of services to their users. One of the main challenges in integrated 
services networks is the design of resource allocation strategies which guarantee the 
delivery of different services, each with its own QoS requirement, and maximize
some performance criterion (e.g. the network’s utility to its users). The challenge in 
determining such resource allocation strategies comes from the fact that the network 
is an informationally decentralized system. 

The topic of resource allocation for informationally decentralized systems has 
been explored in great detail by mathematical economists in the context of mech- 
anism design. In this chapter we discuss how decentralized network resource allo- 
cation problems fit within the context of mechanism design, and how ideas from 
mechanism design can provide useful insight into the nature of decentralized net- 
work resource allocation problems. We first present a brief history of the develop- 
ment of the ideas that led to the current state-of-the-art of the theory of mechanism 
design (Section 13.2.1). Then, we present the key features of the two components of 
mechanism design, namely, realization theory (Section 13.2.2) and implementation 
theory (Section 13.2.3). To illustrate how network resource allocation problems fit 
within the context of mechanism design, and how mechanism design can be used 
to provide insight into the nature of network resource allocation problems, we con- 
sider two classes of network problems: unicast service provisioning with routing and 
QoS requirements, and multi-rate multicast service provisioning. We discuss these 
problems in Section 13.3 from the realization theory point of view. We investigate 
unicast resource allocation with routing from the implementation theory point of 
view in Section 13.4. We conclude in Section 13.5, by summarizing our discussion 
and identifying some open problems. 


13.2 Mechanism Design 


13.2.1 Historical background 


Traditionally, economic analysis treated economic systems as one of the “givens.” 
That is, it was assumed that for a given problem the structure of the economic system 
considered in order to generate a solution is fixed. 

At the turn of the last century, economists started to question the effect that the 
structure of the system has upon the solution of the problem. Although the search for 
a “better system” has been around at least since Plato’s Republic, this issue became 
more relevant with the emergence of the socialist and capitalist economic systems. 

One of the major issues that arose from the debate surrounding the virtues of the 
socialist and the capitalist systems was the method by which resources should
be allocated. From the early stages of the debate most economists envisioned that 
the resources in a socialist system would be allocated by the use of a centralized 
coordinator, while in a capitalist system the resources would generally be allocated 
through the use of a market. This debate attracted a lot of attention, with promi- 
nent economists like Bukharin [15, 16], Dickinson [26], Doob [29], Kautsky [61], 
Lange [70], Lenin [72], Lerner [73, 74], Marschak [80], Neurath [93], and Taylor 
[130] arguing in favor of the socialist system; Pierson [105], von Hayek [135-139], 
and L. von Mises [140] arguing in favor of the capitalist system; and Barone [10], 
Pareto [99-101], and Walras [141] contributing to the mathematical foundations.³


Yet, with all these contributions, very few fundamental results on resource allocation 
theory were available until the 1930s. One of the major reasons for this was the lack 
of mathematical tools required to tackle such problems. The research efforts in the 
1930s along with the subsequent mathematization of classical welfare economics, 
von Hayek’s work, and the developments on mathematical programming and game 
theory set the mathematical foundations for the development of the theory of mech- 
anism design. 

In the 1930s three major research efforts relevant to the design of allocation 
mechanisms began: i) the development of resource allocation methods for the social- 
ist economy (with major contributors being Lange [70], Lerner [73, 74] and Taylor 
[130]); ii) the efforts of Hotelling [37, 38] and Lerner [73] on marginal cost pricing 
and consumer-producer surplus; and iii) the development of the “new welfare eco- 
nomics” (with major contributions by Hicks [36], Kaldor [55], and Scitovsky [21]). 
A decade later researchers began developing the mathematization of “classical wel- 
fare economics” [3, 4, 7, 9, 24, 25, 66, 71]. 

In the late 1930s and early 1940s, von Hayek made the following key observa- 
tions: (i) the amount of available information required and the amount of calcula- 
tions needed by a central-control system in order to determine an optimal allocation 
would be enormous; and (ii) the economic incentives provided by the market econ- 
omy could not be reproduced by any of the socialist models. Von Hayek argued that 
even with the use of “fast” algorithms, the problem required to be solved may be 
overwhelming and no human or computer could calculate a solution. Von Hayek 
also argued that the process of placing the “right” information in the hands of the 
computing and decision making agencies may be very difficult. Since information is 
dispersed throughout the economy (with no agent having full knowledge of the state 
of the economy), it must be communicated among the economic agents in order for 
a solution to be determined. This information exchange may be very costly, and in 
many cases it may be impossible for a central-control agent to have full knowledge 
of the state of the economy. 

Following the initial results of the 1930s, three major lines of research had a great 
influence on the development of resource allocation mechanisms and helped to es- 
tablish a more comprehensive understanding of the main features of such problems: 


• activity analysis and linear programming (Dantzig, Kantorovich, Koopmans),

• game theory and iterative solution procedures (von Neumann and Morgenstern,
George Brown, Julia Robinson),

• investigation of the relationship between linear/nonlinear programming, two-person
zero-sum games and Lagrange multipliers (Gale, Kuhn, Tucker).


The early breakthroughs in the field of linear programming greatly influenced
the mathematization of classical welfare economics. Although linear programming
models are not able to handle goal conflicts due to the multiplicity of consumers as
well as constraints arising from the decentralization of information, they are still a
very important step in analyzing and understanding multi-objective problems.


3 For a more detailed presentation of the historical aspects of the socialist controversy we
refer the reader to [125].

Game theory is concerned with the interactive behavior of “rational” man. It is 
the study of mathematical models of conflict among rational decision makers. Game 
theory provided general mathematical techniques for analyzing situations where two 
or more decision makers’ decisions influence one another’s welfare. As such, game 
theory offered insights of fundamental importance for researchers in many branches 
of social sciences and technology, including resource allocation mechanisms. 

Understanding the interplay between mathematical programming techniques and 
zero sum games, along with the use of Lagrange multipliers (interpreted as shadow 
prices in economic systems) helped to develop tools for analyzing general mecha- 
nism design problems and provided a better perspective on solution methods. 

The original work in classical welfare economics, von Hayek’s observations, and 
the development of linear programming, game theory and Lagrange multipliers, set 
the foundations for the formal development of mechanism design in terms of both 
the “realization” and the “implementation” of Social Choice Rules (SCRs) (goal
correspondences) by decentralized economic systems. Realization theory and imple- 
mentation theory are the two basic components of mechanism design. We briefly 
present the key features of realization and implementation theory next. 


13.2.2 Realization theory 


Formally, resource allocation problems can be described by the following triple: 
environment, action space, and goal correspondence. We define the environment E 
of such problems to be the set of individual endowments, the technology, and pref- 
erences, taken together. More generally, the environment is defined as the set of cir- 
cumstances that cannot be changed either by the designer of the mechanism or by the 
agents. The action space A of the problem is considered to be the set of all possible 
actions (e.g. resource exchanges) conducted by the various agents. Finally, the goal
correspondence $\pi$ is the map from $E$ to $A$ which assigns to every $e \in E$ the set of
actions in A which are solutions to the resource allocation problem. 

The setup described above corresponds to the case in which one of the agents has 
enough information about the environment so as to determine the actions that would 
satisfy the goal correspondence (i.e. the information in the systems is centralized). 
Generally this is not the case. Usually, different agents have different information 
about the environment (i.e. we have an informationally decentralized system). For 
this reason it is desired/necessary to devise a message exchange process among the 
various agents that eventually enables them to jointly take an action which corre- 
sponds to a solution of the centralized problem. We call such a process of communi- 
cation, decisions and actions a resource allocation mechanism. 

The function of a resource allocation mechanism is to guide the agents (economic 
or otherwise) to make decisions that determine the flow of resources. More specif- 
ically, mechanisms provide rules, called response rules (or communication rules), 
according to which agents communicate messages to other agents. These messages 
are generally generated based on the agents’ “private” information about the environ-
ment and prior messages received from other agents. To provide for a transition from 
the dialogue to decisions and actions, the mechanism must also have an outcome rule 
which specifies what actions are to be taken given the course of the dialogue. Gener- 
ally, the mechanism rules may be deterministic or probabilistic; mathematically they 
are expressed as functions or correspondences. 

For simplicity we are going to consider the case in which our mechanism can 
be represented by a tatonnement process. This process consists of a communication 
stage in which agents exchange formal messages in an iterative fashion, followed 
by a decision process, and finally a translation of decisions into actions. The case in 
which the communication, decisions, and actions overlap in time (a non-tatonnement 
process) requires a more general theory and will not be discussed here.

The first effort to formally study resource allocation mechanisms can be traced 
back to Hurwicz’s work [40-43]. Hurwicz models the communication process by 
means of a language and response functions, specifying how each agent determines 
the message to be emitted at each stage of the iterative exchange of messages. After 
the process of communication terminates, decisions are determined on the basis of 
the state of information at the final stage of communication. 

Formally, the mechanism model proposed by Hurwicz can be described by the 
triple $(M, \mu, h)$: a message space $M$, an equilibrium message correspondence $\mu$,
and an outcome correspondence $h$. The message space is the set of messages that
may be exchanged by the agents. The equilibrium message correspondence describes
the sets of messages that the agents “agree” upon given any particular environment.
The outcome correspondence describes the set of actions that are taken based on a
particular set of “equilibrium” messages. The formulation above is depicted graphically in
Fig. 13.1 (cf. [110]). 





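[Figure: diagram relating the environment space $E$, the message space $M$ and the action space $A$ through the correspondences $\mu$ and $h$.]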


Fig. 13.1. Message exchange for a decentralized system. 


Realization theory is concerned with the existence and design of mechanisms 
$(M, \mu, h)$ such that the diagram in Fig. 13.1 commutes. Hurwicz’s setup is quite
general and can incorporate many types of mechanisms. Given a specified goal
correspondence (alternatively called social choice rule or social welfare correspondence)
there may be several mechanisms $(M, \mu, h)$ such that the diagram of Fig. 13.1
commutes. Each of these mechanisms may have different “communication” and
“information processing” characteristics. For example, in a market mechanism with an
“auctioneer” the messages exchanged could be prices and demands. In this model 
the auctioneer updates the prices according to the “excess demand” while the agents
update their demands based on the prices. In such a situation the message space M is 
small but the “information processing” required until a final action is taken is large, 
mainly because the process is iterative and the number of required iterations may 
be very large (in theory they may be infinite). On the other hand, in a “central com- 
mand” type of mechanism the messages are the signals the agents send to a “central 
authority” so as to describe their environments. After receiving the agents’ messages, 
the central authority calculates an optimal allocation of resources and sends the or- 
der for action to the agents. In this situation, if the space of environments is “rich” 
the “dimensionality” [69] of the message space required for communication is very 
large. On the contrary, the amount of information processing is small because it takes 
only one iteration to implement a centralized decision. 
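
The auctioneer-based exchange described above can be sketched as a simple iteration. The following is a toy illustration under stated assumptions, not a mechanism from the text: there is a single resource of fixed capacity, each agent answers a posted price with the hypothetical demand rule `budget / price`, and the auctioneer moves the price in proportion to excess demand. The names `tatonnement`, `budgets` and `step` are illustrative.

```python
def tatonnement(budgets, capacity, price=1.0, step=0.1, tol=1e-9, max_iter=10_000):
    """Iterate price updates driven by excess demand until the market clears.

    Assumed (hypothetical) agent response: demand_i = budget_i / price.
    """
    for iteration in range(max_iter):
        # Agents reply to the posted price using only their private budgets.
        demands = [b / price for b in budgets]
        excess = sum(demands) - capacity
        if abs(excess) < tol:
            return price, demands, iteration
        # Auctioneer raises the price when demand exceeds capacity, lowers it otherwise.
        price += step * excess
    raise RuntimeError("price adjustment did not converge")


price, demands, iterations = tatonnement([1.0, 2.0, 3.0], capacity=3.0)
print(price, demands, iterations)
```

Each round exchanges only one price and one demand per agent, so the message space is small, but many rounds are needed before an action can be taken; this is the trade-off against the one-shot "central command" mechanism.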

The characterization and classification of mechanisms in terms of their
“communication” and “information processing” requirements is an open research area. So
far, research has concentrated mainly on the “communication” requirements, specifically
on the “dimensionality” [39-44, 81, 88] of the message space $M$ required for the
diagram in Fig. 13.1 to commute. Mechanisms $(M, \mu, h)$ that possess the aforementioned
commutative property, have a message space $M$ of minimum “dimensionality,”
and satisfy some additional requirements (described below), have been
called “informationally efficient.” The characterization and comparison of mechanisms
according to their “information processing” requirements has received very
little attention. In the sequel we will state and discuss more precisely the conditions 
under which “realization” theory was developed. 

The following requirements are generally imposed on Hurwicz’s models: 


R1. For each element of the environment $e \in E$ there exists a non-empty set
of possible feasible actions. The notion of feasibility can usually be split into two
categories: individual feasibility and compatibility. In particular:


1. In standard models of production economies, an agent’s individually feasible 
actions are defined to be the set of actions formed by the agent’s production 
function. Within the context of the network problems considered in Section 13.3 
a user’s individual feasible actions are formed by the set of non-negative de- 
mand vectors. On the other hand, the network’s individual feasible actions are 
formed by the set of amounts of services that are delivered and satisfy certain 
QoS requirements and the network’s capacity constraints. 

2. We call an action incompatible if given two different input-output vectors of 
two agents, one calls for an input which the other does not propose to supply. 
Within the context of the network problems considered in Section 13.3 an action 
is incompatible if in equilibrium a user requests an amount of service which 
differs from the amount of service the network intends to supply. 


R2. For each element of the environment the set of feasible actions that satisfy 
the goal correspondence is non-empty. 


R3. The actions generated by $\pi$ must satisfy some sort of optimality criterion.
Examples of such criteria are: efficiency of production (defined by Koopmans [65]),
optimality (introduced by Pareto [100] under the name of “ophelimity”⁴ maximizing),
and social welfare maximization (defined by Bergson [13], Samuelson [113]
and Arrow [6, 8]).


R4. For any environment $e \in E$, $\mu(e) \neq \emptyset$; that is, for any environment there
exists a set of messages to which all agents “agree.”


R5. The maps $\pi$, $\mu$ and $h$ satisfy the following relationship:

$$h(\mu(e)) \in \pi(e) \quad \forall e \in E. \qquad (13.1)$$


In other words, for any system environment, in equilibrium, the messages ex- 
changed by the system agents enable the agents to take actions which achieve an op- 
timal centralized solution. Mechanisms satisfying this assumption are also referred 
to as non-wasteful. 
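
For finite problems, requirements R4 and R5 can be checked by direct enumeration. The sketch below is a toy illustration only; the two-agent, single-item setting and all names (`goal`, `equilibrium_messages`, `award_highest_report`, `always_agent_zero`) are hypothetical and chosen for brevity. It tests whether every equilibrium message leads to an action that the goal correspondence accepts.

```python
from itertools import product

# Hypothetical toy problem: two agents, each privately knowing its value (0 or 1)
# for a single indivisible item. An environment e lists both values.
ENVIRONMENTS = list(product([0, 1], repeat=2))


def goal(e):
    """pi(e): the agents whose value is highest are acceptable recipients."""
    return {i for i, v in enumerate(e) if v == max(e)}


def equilibrium_messages(e):
    """mu(e): each agent reports its own value, so the only message is e itself."""
    return {e}


def award_highest_report(m):
    """h(m): give the item to the first agent with the highest report."""
    return m.index(max(m))


def always_agent_zero(m):
    """A deliberately poor outcome rule, used for contrast."""
    return 0


def is_non_wasteful(outcome):
    """Check R4 (mu(e) is non-empty) and R5 (outcomes lie in pi(e)) for every e."""
    for e in ENVIRONMENTS:
        messages = equilibrium_messages(e)
        if not messages:
            return False
        if any(outcome(m) not in goal(e) for m in messages):
            return False
    return True


print(is_non_wasteful(award_highest_report))
print(is_non_wasteful(always_agent_zero))
```

The first outcome rule passes the check, while the second fails in the environment where only the second agent values the item.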


R6. The non-wasteful criterion established above can sometimes be inadequate. 
For example, we can have the case where the equilibria of a given process always
favor one group of participants at the expense of others. We call such mechanisms
biased. To avoid biased mechanisms we require unbiasedness. 

A formal test for unbiasedness can be viewed as follows: Suppose that we think 
of our process as being formed of two stages. In the first stage the process “dis- 
tributes parameters” (e.g. resources, information etc.) to the various agents, while in 
the second stage we have a tatonnement process. If for any environment $e \in E$ and
any goal realizing action $a \in \pi(e)$ there exists a set of distributional parameters such
that at the end of the tatonnement process the agents take action a, then the process 
is called unbiased. 


R7. For any environment, the rules of the process lead the system to a uniquely 
determined allocation. This requirement may be difficult to satisfy even in the case of 
market-based economies. In such economies there may be multiple allocations which
are optimal for a fixed set of prices; however, all of these allocations have the same
utility for all the agents. We call such processes, where equilibrium indeterminacies
are trivial in nature, essentially single-valued.


R8. There are two types of information regarding the environment agents have 
access to: direct and indirect. The agent’s direct information is obtained through 
observations of the environment. The indirect information is gathered by the agent 
through the exchange of messages with other agents. We assume that an agent’s
direct information is information pertaining only to himself and not to other agents. 
We will refer to processes satisfying this property as informationally consistent. 

When considering the equilibrium messages generated by agents, we call a process
privacy preserving when all the agents generate their messages based only on
their own information about the environment. Hurwicz’s model restricts attention to
privacy preserving resource allocation mechanisms.


4 Pareto was troubled with the concept of ‘utility.’ In its common usage utility meant the
well-being of the individual or society. Pareto realized that when people make economic
decisions they are guided by what they think is desirable for them whether or not that
corresponds to their well-being. Thus, he introduced the term “ophelimity” to replace the
worn-out ‘utility.’ Later, preferences replaced Pareto’s ophelimity.


R9. Assume that the agents communicate with one another through a communi- 
cation alphabet which permits them to communicate in one shot their direct infor- 
mation (profile) to the other agents. Such a language in most cases is too complex 
for consideration and hence is undesirable. Restricting the language may also not 
be enough to alleviate this problem. Agents may be able to encode in a relatively 
“simple” language their full profile. This type of encoding though may be done by 
the use of equilibrium message and outcome correspondences which are highly dis- 
continuous and resemble such functions as the Peano space filling curves [5]. Such 
mechanisms are generally highly unstable (hence undesirable) since minor perturba- 
tions/errors in communication will lead to drastically different/nonoptimal actions. 
In order to alleviate problems such as the above, we introduce the following
requirement: we impose extra conditions, such as spot threadedness, on the correspondences
$\mu$ and $h$.


Definition 1. A correspondence $F : E \to M$ is spot threaded if for every $e \in E$
there exists an open set $U_e \subset E$, and a continuous function $f : U_e \to M$ such that
$f(e') \in F(e')$ for all $e' \in U_e$.


We note that the first three requirements R1-R3 are constraints on the type of
problems considered, and they are defined independently of the mechanism. The
next four requirements R4-R7 are imposed on the mechanisms to be considered
and are generally referred to as (Pareto) satisfactoriness⁵. Mechanisms that satisfy
R8-R9 are called regular.

Realization theory was developed using subsets of the requirements above. To 
proceed with a more formal description of the results on communication and infor- 
mation processing requirements we need the following definitions: 


Definition 2. We say a mechanism $(M, \mu, h)$ is goal realizing if it satisfies
requirements R1, R2, R4 and R5.


Definition 3. We say that a mechanism $(M, \mu, h)$ is informationally efficient if it is
goal realizing and regular and it has a message space of a dimensionality which is 
minimal among all the other goal realizing and regular mechanisms. 


To the best of our knowledge, this definition of informational efficiency is differ- 
ent from that appearing in most of the literature on realization theory. Our definition, 
compared to the definition appearing in the literature, imposes more requirements on 
the properties an informationally efficient mechanism must satisfy. We show that the 
decentralized network resource allocation mechanisms we present in Section 13.3 
are informationally efficient according to our definition of informational efficiency. 

Most of the research on realization theory has dealt with the discovery of goal
realizing mechanisms that have a message space of minimum dimension among the
message spaces of goal realizing and regular mechanisms. Some of the key results
are: (i) the competitive process is Pareto satisfactory over classical environments⁶
[88, 89]; (ii) for classical environments, the competitive mechanisms are goal realizing
and have a message space of minimum dimension among the message spaces of
goal realizing and regular mechanisms [40, 44, 88]⁷; (iii) competitive mechanisms
are informationally efficient for classical environments where the utility functions are
of the Cobb-Douglas form [88]; (iv) for environments with public goods the Lindahl
mechanism is goal-realizing and has a message space of minimum dimension among
the message spaces of all goal-realizing and regular mechanisms [43, and references
therein]; (v) if the dynamics of allocation mechanisms are considered explicitly and
stability is required then the size of the message space has to increase [53, 90].
Work in [56, 108, 109] addressed issues related to the complexity of information
processing of goal-realizing mechanisms that have message spaces of minimum
dimension among the message spaces of goal-realizing and regular mechanisms.


5 In many models requirement R7 is omitted in the definition of Pareto satisfactoriness.


13.2.3 Implementation theory 


It is well known that the theory of organizational control systems is concerned 
with two types of rules: operational and enforcement. The operational rules describe 
how the system “should” operate, while the enforcement rules assure that the opera- 
tional rules are followed. Enforcement rules fall within two categories: explicit and 
implicit. While explicit enforcement rules generally use monitoring techniques in 
order to control agents’ behavior in a system, implicit enforcement motivates agents’
behavior by providing appropriate incentives. 

In the previous section we defined a mechanism to be a set of operational rules, 
according to which the system’s agents generate messages which lead to desired 
actions. The question that arises is: Can we expect the agents to follow such rules? 
The answer to the question above is provided by the theory of implementation. 

The theory of implementation is generally concerned with strategic behavior of 
allocation procedures, and generally studies implicit enforcing rules. It is concerned 
with the design/discovery of “game forms” that implement, in some behavioral equi- 
librium (solution concept), social choice rules/goal correspondences. 

Specifically, $N$-agent “game forms” are defined as pairs of the form $(M, h)$,
where $M = \prod_{i=1}^{N} M_i$, $M_i$ is the message space of agent $i$, $i = 1, 2, \ldots, N$,
and $h : M \to A$. Thus, for each profile $m := (m_1, m_2, \ldots, m_N)$ of messages,
$h(m) \in A$ represents the resulting outcome or allocation. A game form is different
from a game as the consequence of a profile of messages is an outcome (allocation)
rather than a vector of utility payoffs. Once a preference profile, i.e. a complete,
binary and reflexive preordering $R(e_i)$ that describes the $i$th agent’s preferences
over alternatives in $A$ when $i$’s environment is $e_i \in E_i$, is specified for each
$i = 1, 2, \ldots, N$, a game form induces a game.


6 A classical environment is defined to be a convex economy (i.e. concave utility functions 
and convex constraint sets), free of externalities (an externality is present when wellbeing 
of an agent is directly affected by the actions of another agent). 

7 In [40, 44, 88] it is not required that the competitive mechanism should be regular in the
whole space of environments. 
The principal difference between a game form and Hurwicz’s original model,
described by $(M, \mu, h)$, is the following: In game forms the message correspondence
(described by $\mu$ in Hurwicz’s model) is not a design variable, but is induced by the
outcome function and the behavioral equilibrium concept (e.g. Nash, Bayesian Nash, 
maxmin, undominated etc.). 

A solution concept (or equilibrium concept) specifies the strategic behaviors of
agents (individuals, users) faced with a game form $(M, h)$ given a preference profile
$R(e) := (R(e_1), R(e_2), \ldots, R(e_N))$. Hence, a solution concept is a correspondence
$\Lambda$ that identifies a subset of $M$ for any given specification $(M, h, R(e))$. We
define

$$O_{\Lambda}((M, h, R(e))) := \{a \in A \mid \exists\, m \in \Lambda((M, h, R(e))) \text{ s.t. } h(m) = a\} \qquad (13.2)$$

as the set of outcomes associated with the solution concept $\Lambda$.

To illustrate (13.2) consider a pure strategy Nash equilibrium as the solution
concept. For any given $(M, h, R(e))$ a pure Nash equilibrium is a message
$m := (m_1, m_2, \dots, m_N) \in M$ such that

$$h(m_i, m_{-i}) \; R(e_i) \; h(m_i', m_{-i})$$    (13.3)

for all $i = 1, 2, \dots, N$, and all $m_i' \in M_i$, where
$m_{-i} := (m_1, m_2, \dots, m_{i-1}, m_{i+1}, \dots, m_N)$.

Denote the messages satisfying (13.3) by $NE((M, h, R(e)))$. Then the set of
associated outcomes is

$$\Omega_{NE}((M, h, R(e))) := \{a \in A \mid \exists\, m \in NE((M, h, R(e))) \text{ s.t. } h(m) = a\}.$$    (13.4)
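
When the message spaces are finite, the pure Nash equilibrium condition and the outcome set (13.4) can be checked by direct enumeration. The following Python sketch is only an illustration and not part of the original development. It assumes that each preordering $R(e_i)$ is represented by a numerical utility, so that $a\,R(e_i)\,b$ holds exactly when $u_i(a) \ge u_i(b)$; the function names and the two-agent example are hypothetical.

```python
from itertools import product


def pure_nash_equilibria(message_spaces, h, utilities):
    """Enumerate message profiles from which no agent gains by deviating alone.

    message_spaces: list of finite lists, one M_i per agent (assumption: finite)
    h: outcome function mapping a tuple of messages to an outcome in A
    utilities: list of dicts u_i[outcome]; u_i(a) >= u_i(b) encodes a R(e_i) b
    """
    equilibria = []
    for m in product(*message_spaces):
        stable = True
        for i, space in enumerate(message_spaces):
            for alternative in space:
                deviation = m[:i] + (alternative,) + m[i + 1:]
                if utilities[i][h(deviation)] > utilities[i][h(m)]:
                    stable = False  # agent i strictly prefers to deviate
                    break
            if not stable:
                break
        if stable:
            equilibria.append(m)
    return equilibria


def nash_outcomes(message_spaces, h, utilities):
    """Outcome set associated with the pure Nash solution concept."""
    return {h(m) for m in pure_nash_equilibria(message_spaces, h, utilities)}


# Hypothetical game form: two agents, binary messages, outcome "x" iff messages agree.
spaces = [[0, 1], [0, 1]]
outcome = lambda m: "x" if m[0] == m[1] else "y"
both_like_x = [{"x": 1, "y": 0}, {"x": 1, "y": 0}]   # one preference profile
opposed = [{"x": 1, "y": 0}, {"x": 0, "y": 1}]       # another preference profile
```

The same game form induces different games under the two preference profiles: with `both_like_x` the matching profiles are equilibria, whereas with `opposed` no pure equilibrium exists and the outcome set is empty.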
To precisely define how social choice correspondences are implicitly enforced 
via game forms in some behavioral equilibrium we need the following: 





Definition 4. A social choice correspondence $\pi : E \twoheadrightarrow A$ is implemented by the
game form $(M, h)$ via the solution concept $\Lambda$ if

$$\Omega_\Lambda((M, h, R(e))) = \pi(e)$$

for all $e \in E$.


Definition 5. A social choice correspondence $\pi : E \twoheadrightarrow A$ is said to be
implementable via the solution concept $\Lambda$ if there exists a game form $(M, h)$ that
implements it.


The form of implementation above is called full implementation or strong
implementation since it requires that the outcomes of a game form coincide with those of
the social choice correspondence. A weaker form of implementation (called weak
implementation) is one where for every $e \in E$,

$$\Omega_\Lambda((M, h, R(e))) \subseteq \pi(e).$$

A comparison between full implementation and weak implementation is presented
in Thompson [133].
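
The distinction between the two notions is a set comparison carried out environment by environment. A minimal Python sketch follows, assuming a finite set of environments and assuming that the equilibrium outcome sets have already been computed; the function name and the sample data are hypothetical.

```python
def implementation_type(equilibrium_outcomes, social_choice, environments):
    """Classify a game form as a 'full', 'weak', or 'none' implementation.

    equilibrium_outcomes[e]: set of equilibrium outcomes of the game form at e
    social_choice[e]: the set pi(e) selected by the social choice correspondence
    """
    if all(equilibrium_outcomes[e] == social_choice[e] for e in environments):
        return "full"
    # Weak implementation only asks for inclusion; note that an empty
    # equilibrium set satisfies the inclusion trivially.
    if all(equilibrium_outcomes[e] <= social_choice[e] for e in environments):
        return "weak"
    return "none"


envs = ["e1", "e2"]
pi = {"e1": {"a"}, "e2": {"a", "b"}}
exact = {"e1": {"a"}, "e2": {"a", "b"}}
partial = {"e1": {"a"}, "e2": {"b"}}
stray = {"e1": {"a", "b"}, "e2": {"b"}}
```

Here `exact` reproduces every set in `pi`, `partial` reaches only part of `pi("e2")`, and `stray` produces the outcome "b" at "e1", which lies outside `pi("e1")`.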

Within the context of implementation theory there have been significant devel- 
opments in the characterization of social choice rules that can be implemented in 
dominant strategies [20, 32]; in Nash equilibria [83-85, 112, 143]; or in refined Nash 
equilibria such as subgame perfect equilibria [2, 87], undominated Nash equilibria 
[1, 46, 48, 96], trembling hand perfect Nash equilibria [123]; or in Bayesian Nash 
equilibria [45, 95, 97, 106]. Excellent survey articles on implementation theory are 
[47, 85, 94]. These articles summarize the state of the art on implementation theory 
up to the time of their publication. 

Direct revelation game forms (otherwise called direct mechanisms or direct
revelation mechanisms) are a particular class of game forms that have a natural appeal
and have received significant attention. In direct revelation game forms $M_i = E_i$
for each agent $i$. In effect, each agent reports his own environment, but not
necessarily his true one. The interest in direct revelation game forms stems from
the revelation principle. The revelation principle is the observation that if a game
form $(M, h)$ implements a social choice correspondence $\pi$ ($\pi : E \twoheadrightarrow A$), then
there exists a direct revelation game form $(E, h^*)$ which has the following
properties: (1) announcing one’s true characteristic is an equilibrium message; and (2)
$h^*(e_1, e_2, \dots, e_N) = h^*(e) \in \pi(e)$ for all $e \in E$. Even though the direct
revelation game form has the aforementioned properties, it does not necessarily
implement the social choice correspondence $\pi$. This is because the direct mechanism may
have multiple equilibria which give rise to outcomes which, for some $e \in E$, are
not in $\pi(e)$ (see [20]). Thus, one cannot conclude from the revelation principle that
all one ever needs to consider are direct revelation game forms. Only under certain
conditions (see [20]) can a social choice rule $\pi$ be implemented by a direct
revelation game form. Most of the literature on implementation in dominant strategies
and in Bayesian equilibria has used truthful implementation, an implementation
concept that requires only that the truthful equilibrium of a direct revelation game form
$(E, h^*)$ be in the choice set, i.e., $h^*(e) \in \pi(e)$ $\forall e \in E$.
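
The gap between truthful implementation and full implementation can be seen in a very small example. The Python sketch below is a hypothetical illustration constructed for this purpose, not an example taken from the cited literature. Two agents each report which of two outcomes, "a" or "b", they prefer; the direct mechanism `h_star` selects "a" only when both report "a"; and the social choice rule `pi` is taken to be the truthful outcome. Truth-telling is then a pure Nash equilibrium in every environment, yet when both agents prefer "a" the report ("b", "b") is also an equilibrium and yields "b", which is not in `pi`.

```python
from itertools import product

TYPES = ["a", "b"]  # E_i: the outcome that agent i prefers (illustrative assumption)


def h_star(report):
    """Direct mechanism: choose 'a' only under unanimous reports of 'a'."""
    return "a" if report == ("a", "a") else "b"


def pi(e):
    """Social choice rule, here defined as the outcome under truthful reports."""
    return {h_star(e)}


def weakly_prefers(agent_type, x, y):
    """With two outcomes, an agent weakly prefers x to y unless y is its favourite and x is not."""
    return x == agent_type or y != agent_type


def equilibrium_reports(e):
    """All report profiles that are pure Nash equilibria when the true environment is e."""
    equilibria = []
    for m in product(TYPES, repeat=2):
        if all(
            weakly_prefers(e[i], h_star(m), h_star(m[:i] + (alt,) + m[i + 1:]))
            for i in range(2)
            for alt in TYPES
        ):
            equilibria.append(m)
    return equilibria


true_env = ("a", "a")
outcomes_at_true_env = {h_star(m) for m in equilibrium_reports(true_env)}
```

In this example `outcomes_at_true_env` contains both "a" and "b", so the direct mechanism truthfully implements `pi` without implementing it in the full sense.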


13.3 Mechanism Design in Networks: 
A Realization Theory Point of View 


To illustrate how mechanism design can be used in networks we present two 
classes of network resource allocation problems and discuss them from the realiza- 
tion theory point of view. The two classes of problems are: 1) resource allocation 
in unicast with routing and end-to-end Quality of Service requirements; and 2) rate 
allocation in multi-rate multicast service provisioning. For these two problems we 
present two distinct pricing mechanisms which achieve the solution of the central- 
ized resource allocation problem, satisfy the informational constraints imposed by 
the decentralization of information in networks, and are informationally efficient. 
We show how these mechanisms can be embedded into Hurwicz’s abstract framework;
in [125] we have shown how ideas from realization theory can be used to
establish the pricing mechanisms’ informational efficiency.


13.3.1 Unicast with routing and Quality of Service requirement 
Problem formulation 


We consider a set of users/agents, denoted by $\mathcal{N} = \{1, 2, \dots, N\}$, requesting
various services from a network. For each user $i \in \mathcal{N}$ we denote by $M_i$ the set of
types of services requested by that user. For each $i \in \mathcal{N}$ each service $j \in M_i$ must
satisfy end-to-end Quality of Service (QoS) requirements denoted by
$F_{ij}$. Assume that user $i$’s preference over the set of services it requests is summarized
by a utility function $U_i(x_i)$, where $x_i \in \mathbb{R}_+^{|M_i|}$. We consider the network to be the
$(N + 1)$th agent.

The network is formed by a set of links $L$. For every $l \in L$, $K_l$ denotes the set
of resources on link $l$. Denote by $K = \bigotimes_{l \in L} K_l$ the set of resources available at the
different links of the network, and by the vector $c_K$ the amounts of resources at those
links. Define $T_L$ to be the topology of the network and $R_{T_L}$ to be the set of possible
routes over which each service requested from the network can be delivered. For
each user $i \in \mathcal{N}$ and for each service type $j \in M_i$ denote by $F_{ij}(R_{T_L}, K)$ the set of
all resource allocations along all the possible routes of service $j$ that guarantee the
end-to-end QoS requirements $F_{ij}$. Also, for $t \in R_{T_L}$, $i \in \mathcal{N}$, and $j \in M_i$ denote
by $F_{ijt}(R_{T_L}, K)$ the set of all resource allocations along route $t$ that guarantee the
end-to-end QoS requirements $F_{ij}$.

The goal of the network is to allocate resources to the various services in order to 
maximize a social welfare function described by the sum of the user utilities, while 
satisfying the QoS requirements imposed by the offered services. Hence, the goal of 
the network is: 


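$$\max_{x_i,\, r^{ij}} \; \sum_{i \in \mathcal{N}} U_i(x_i)$$    (P)

subject to:

$$x_i \in \mathbb{R}_+^{|M_i|},$$    (P.a)

$$r^{ij} \in F_{ij}(R_{T_L}, K),$$    (P.b)

$$\sum_{i \in \mathcal{N}} \sum_{j \in M_i} x_{ij}\, r^{ij}_{lk} \le c_{lk}.$$    (P.c)


For each user $i \in \mathcal{N}$, each service $j \in M_i$, any set of routes $R_{T_L}$ and any resource
availability $K$, the set $F_{ij}(R_{T_L}, K)$ is well defined, compact and non-empty.    (P.d)


The users’ utility functions are concave, strictly increasing and continuously
differentiable.    (P.e)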


In P.b and P.c, $r^{ij}_{lk}$ stands for the amount of resource of type $k$ on link $l$ assigned
to the service of type $j$ requested by user $i$, $r^{ij}$ represents the vector of resources
allocated to user $i$ for service $j$, and $c_{lk}$ is the amount of resource of type $k \in K_l$
on link $l$.

In addition, the following informational constraints are present:


The network has no information about the users’ utility functions or the number of
users interested in services.    (P.f)


Each user’s preferences over the particular services are his private information. The
users are unaware of the topology of the network, the amount of resources available
on each link, and the method by which the network delivers their services. The users are
also unaware of the number of other users requesting services from the network, or
their utility functions.    (P.g)


The assumption that the network manager has complete knowledge of the net- 
work topology and resources is not an unrealistic one. For example, a corporate in- 
tranet or VPN (virtual private network) may have a single provider of resources and 
services, who is likely to have such knowledge about the network, and who will assume
the role of network management in collecting aggregate excess demand on links
and adjusting link prices. In particular, some resource/service providers use very so- 
phisticated network management tools to monitor in real time the proper functions of 
a network (e.g., events such as congestion, fault, server ups and downs), and to issue 
appropriate response/commands. Such monitoring requires complete knowledge of 
the network (e.g., topology, resources, router/link capacities), as well as separate net- 
work management protocols to pass information to and from the management site. 
These tools can easily be used to acquire information on aggregate excess demands 
and to adjust link prices. 

The goal in unicast with routing and QoS requirements is to determine a mechanism
that allocates resources in order to generate services for individual users,
satisfies the QoS requirements for all the services delivered, is social welfare
maximizing,⁸ and satisfies the aforementioned informational constraints. To achieve this
goal we present a market mechanism, which results in a solution of the centralized
optimization problem P-P.e and satisfies the informational constraints P.f-P.g.

Unicast service provisioning has received significant attention. Most of the results
on decentralized resource allocation in unicast service provisioning currently
available in the literature are based on pricing mechanisms [17, 19, 22, 33, 49, 54,
62, 64, 76, 78, 79, 91, 92, 103, 104, 131, 142]. These publications have addressed, 
either by analysis [19, 22, 33, 49, 62, 64, 76, 78, 92, 131, 142], or simulation and 
analysis [17, 33, 91, 103, 104], a subset of the issues outlined in the goal of the uni- 
cast problem stated in the previous paragraph. A significant number of publications 
have dealt with single link networks [22, 91, 104, 142], or with the allocation of a 
single resource per connection [22, 33, 49, 64, 78, 91, 92, 104, 142]. 


Market mechanism 


We proceed as follows: First, we describe a competitive market economy
consisting of service providers, users and an auctioneer. Then, within the context of this
market we specify a procedure, used by the auctioneer, which leads to a resource
allocation that achieves a solution of Problem P.


8 The social welfare function in this work is characterized by the sum of individual user
utility functions.


Description of the market 

In our market, for conceptual clarity, we assume that the network consists of a
service provider and an auctioneer. Under this assumption, the economy consists of
the following three types of agents: a service provider, users and an auctioneer. The
auctioneer sets the prices per unit of resource at each link. The price of resource $k$
at link $l$ is denoted by $\lambda_{lk}$. The service provider and the users are price takers. They
act as if their behavior has no effect on the equilibrium prices reached by the market
allocation process. The service provider uses the network’s resources and the prices
$\lambda_{lk}$, specified by the auctioneer, to set up services and the corresponding prices for
each unit of these services. Then, it announces the price per unit of service for each
service to the users. Based on the announced prices, each user decides the type of
services and the amount of each service it should request.

We observe that the price taking assumption and the fact that we try to maximize 
the sum of the users’ utilities imply that: (i) the service providers will not attempt to 
make a profit; and (ii) the service prices are directly derived from the resource prices. 

Below we describe each type of agent in more detail. 


Service providers: The users request services from the service providers. Each of
these requests is described by the origin, destination and the minimal level of quality
of service required. The services are indexed by the $(i, j)$ pair, with $i \in \mathcal{N}$
representing the user and $j \in M_i$ representing the service type. For each pair $(i, j)$ there
exists a set $T^{ij}$ of possible routes that can be used. Denote by $V_{ijt}$ the set of links
forming route $t \in T^{ij}$. The service provider allocates resources $r^{ijt}_{lk}(\lambda)$, $l \in V_{ijt}$, so
that the minimum cost for the service and the lowest acceptable level of quality of
service are attained. We assume that each service cannot be distributed over multiple
routes.

Since the service provider is not a profit maker, it allocates resources for each
type of connection by solving:

$$r^{ijt}(\lambda) \in \arg\min_{r^{ijt} \in F_{ijt}(R_{T_L}, K)} \; \sum_{l \in V_{ijt}} \sum_{k \in K} \lambda_{lk}\, r^{ijt}_{lk}$$    (13.5)

where $i \in \mathcal{N}$, $j \in M_i$, $t \in T^{ij}$. For each $(i, j)$ pair, equation (13.5) generates a
set of allocations that result in a minimum price per unit of service for each route
$t \in T^{ij}$. Then, the service provider computes the price per unit of service for route
$t$,

$$p_{ijt}(\lambda) = \sum_{l \in V_{ijt}} \sum_{k \in K} \lambda_{lk}\, r^{ijt}_{lk}(\lambda),$$    (13.6)

where $r^{ijt}_{lk}(\lambda)$ are determined by (13.5). Finally, the service provider computes, for
each $i \in \mathcal{N}$, $j \in M_i$,

$$p_{ij}(\lambda) = \min_{t \in T^{ij}} p_{ijt}(\lambda)$$    (13.7)

and announces the prices $p_{ij}(\lambda)$, $i \in \mathcal{N}$, $j \in M_i$, to the users. If for some $(i, j)$ there
are two or more routes of minimum price, the service provider picks one of these
routes.
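
The service provider’s computation for a single service can be sketched in a few lines of Python. This is an illustration under simplifying assumptions rather than the procedure of [129]: the QoS-feasible allocations on each route are assumed to be given as a finite list (in the text the feasible set is a compact set, so a real implementation would solve a continuous minimization), and all names and numbers are hypothetical.

```python
def allocation_cost(allocation, prices):
    """Cost of one unit of service: sum over (link, resource) of price times amount."""
    return sum(prices[key] * amount for key, amount in allocation.items())


def service_price(candidates_by_route, prices):
    """Return (price, route, allocation) of a cheapest feasible allocation.

    candidates_by_route: {route: [allocation, ...]}, each allocation a dict keyed by
    (link, resource) and assumed to satisfy the end-to-end QoS requirement.
    The inner loop plays the role of the per-route minimization, the outer loop
    the role of the minimization over routes.
    """
    best = None
    for route, candidates in candidates_by_route.items():
        for allocation in candidates:
            cost = allocation_cost(allocation, prices)
            if best is None or cost < best[0]:  # ties keep the first route found
                best = (cost, route, allocation)
    return best


# Hypothetical data: one resource ("bw") on three links, two candidate routes.
link_prices = {("l1", "bw"): 2.0, ("l2", "bw"): 1.0, ("l3", "bw"): 0.5}
candidates = {
    "t1": [{("l1", "bw"): 1.0}],
    "t2": [{("l2", "bw"): 1.0, ("l3", "bw"): 1.0},
           {("l2", "bw"): 1.2, ("l3", "bw"): 0.8}],
}
price, route, allocation = service_price(candidates, link_prices)
```

With these numbers the two-link route "t2" is cheaper than the direct route "t1", so the announced service price is the cost of the first allocation on "t2".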


Users: Users request one way connections from the service provider. Based on the
prices $p(\lambda)$ announced by the service provider, the users demand a number of
connections determined by

$$x^i(p) \in \arg\max_{x_i \in \mathbb{R}_+^{|M_i|}} \Big\{ U_i(x_i) - \sum_{j \in M_i} p_{ij}\, x_{ij} \Big\}, \quad \forall i \in \mathcal{N}.$$    (13.8)


Auctioneer: The role of the auctioneer is to regulate the prices of the resources. He
does this based on the aggregate excess demand vector $z(\lambda, t)$:

$$z_{lk}(\lambda, t) = \sum_{i \in \mathcal{N}} \sum_{j \in M_i} \big( x_{ij}\, r^{ijt}_{lk} \big) - c_{lk}$$    (13.9)

where $l \in L$, $k \in K$, and $r^{ijt}_{lk}$ is determined by (13.5)-(13.7).


The tâtonnement process 

We present a tâtonnement process, specified by an algorithm called Algorithm
1, that describes how the market works. The algorithm proceeds iteratively as
follows:


Step 1: The auctioneer announces prices $\lambda$ for the resources at each node of the
network. The users announce their desired services to the service provider.

Step 2: Based on the auctioneer’s announcement, the service provider computes
the minimum price per unit of service according to (13.5)-(13.7). The service
provider announces these prices to the users.

Step 3: Based on the prices $p$ announced by the service provider, the users request
services in the amount $x(p)$ satisfying (13.8).

Step 4: Based on the service demand vector $x(p)$, the auctioneer computes, through
(13.9), the excess demand vector $z(\lambda)$.

Step 5: If $z(\lambda) \le 0$ then the process ends. Otherwise the auctioneer changes the
prices $\lambda$ of resources according to a specific mechanism based on Scarf’s
algorithm, which is described in detail in [125, 129, 132], announces new prices, say
$\lambda'$, and the process is repeated from Step 2 on.
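
The structure of the iteration can be illustrated with a small Python sketch. It is not Algorithm 1 itself: the price update used here is a plain adjustment proportional to excess demand, substituted for the Scarf-based update of [125, 129, 132], which is not reproduced. Further assumptions made only for this sketch are that each user requests a single service over a fixed route, that one unit of service consumes one unit of a single resource on every link of the route, that utilities have the form $w \log(1 + x)$, and that the stopping test adds a complementary-slackness tolerance to the sign test of Step 5. All names are hypothetical.

```python
def demand(weight, price):
    """Maximizer of weight*log(1+x) - price*x over x >= 0 (assumed utility form)."""
    return max(0.0, weight / price - 1.0)


def tatonnement(weights, routes, capacity, step=0.01, tol=1e-6, max_iter=100_000):
    """Iterate prices -> service prices -> demands -> excess demand -> new prices."""
    lam = {link: 1.0 for link in capacity}           # Step 1: initial link prices
    for _ in range(max_iter):
        # Step 2: with fixed routes, the service price is the sum of link prices.
        p = [sum(lam[link] for link in route) for route in routes]
        # Step 3: each user responds with its utility-maximizing demand.
        x = [demand(w, price) for w, price in zip(weights, p)]
        # Step 4: aggregate excess demand on every link.
        z = {
            link: sum(xi for xi, route in zip(x, routes) if link in route) - capacity[link]
            for link in capacity
        }
        # Step 5: stop when no link is overloaded and priced links are (nearly) full.
        if all(z[l] <= tol and lam[l] * abs(z[l]) <= tol for l in capacity):
            return lam, x
        # Simple proportional update, NOT Scarf's algorithm; prices kept positive.
        lam = {l: max(1e-9, lam[l] + step * z[l]) for l in capacity}
    raise RuntimeError("price adjustment did not converge")


# Hypothetical instance: two users sharing one link of capacity 10.
prices, rates = tatonnement(weights=[4.0, 6.0], routes=[["l1"], ["l1"]],
                            capacity={"l1": 10.0})
```

For this instance the market-clearing price solves $4/\lambda - 1 + 6/\lambda - 1 = 10$, that is $\lambda = 10/12$, and the sketch approaches that value. The step size is an ad hoc choice that happens to give convergence here; no such guarantee is claimed for general networks.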


Embedding into Hurwicz’s framework 


We embed the unicast routing problem, formulated in Section 13.3.1, within the 
framework of Hurwicz’s model described in Section 13.2.2. 


The resource allocation problem: 
As presented in Section 13.2.2, a resource allocation problem can be described
by the following triple: environment, action space, and goal correspondence.

Environment: The “characteristics” of a particular agent $i$, say $e^i$, are called the
local environment of $i$. The set of all possible environments of $i$ is denoted by $E^i$.
The (system) environment is a tuple consisting of the local environments of all
agents and is denoted by $e$. The set of possible system environments is denoted by
$E := \big(\bigotimes_{i \in \mathcal{N}} E^i\big) \otimes E^{N+1}$.

For the network Problem P the local environment $E^i$ of each user $i$ is the set of
differentiable concave functions on $\mathbb{R}_+^{|M_i|}$. The local environment of the network is the
set $E^{N+1} \triangleq \{\{T_L\} \times \{R_{T_L}\} \times \{K\} \times F_{ij}(R_{T_L}, K) \times \{c_K\}\}$.


Action Space: The set of possible actions taken by the system is called the action
space of the system, and is denoted by $A$. For Problem P the action space is the
feasible region of P.


Goal Correspondence: The relation between the environments and the (desired)
actions of the system is represented by a point-to-set map, called the goal
correspondence / social choice rule / social welfare maximizing rule, and is denoted by $\pi$. For
Problem P, $\pi : E \twoheadrightarrow A$ is defined as follows:

$$\pi(e) := \arg\max P(e).$$    (13.10)


Mechanism specification 

A mechanism in equilibrium correspondence form is characterized by the
following triple $(M, \mu, h)$, where $M$ is the message space, $\mu$ is the equilibrium
correspondence, and $h$ is the outcome function.


Message Space: The set of messages chosen for communication by the designer is
called the message space, and is denoted by $M$. The size of a finite dimensional
message space $M$ is defined to be the dimension [69] of the smallest real vector
space in which there is an open set $W$ such that $M \subseteq W$. The size of $M$ is denoted
by $\dim M$.

In the allocation mechanism described above (see [129] for details), two types of 
messages are exchanged among agents: 


• The prices per unit of service for each service are communicated by the network
to the users. We denote by $p := (p_1, p_2, \dots, p_q)$ the vector of prices, where $q$ is
the number of network services available for delivery.

• The demands for services are communicated by the users to the network. We denote
by $x := \{x_{ij} : i \in \mathcal{N}, j \in M_i\}$ the vector of user demands.


Thus, the message space for Problem P has dimension equal to the number of
user demands $\sum_{i \in \mathcal{N}} |M_i|$ (where $|\cdot|$ denotes the cardinality of the set) plus the
number of different services supplied by the network.


Equilibrium Correspondence: The relation between the environment and a
mechanism’s equilibrium messages is represented by a point-to-set map, called the equilibrium
correspondence, and denoted by $\mu$ ($\mu : E \twoheadrightarrow M$). The individual equilibrium
correspondence of participant $i$, denoted by $\mu^i : E^i \twoheadrightarrow M$, represents a relationship
between the local environment of $i$ and the terminal messages emitted by $i$.

To capture the private nature of the initial distribution of information, we require
the following privacy-preserving property to be satisfied:

$$\mu(e) = \bigcap_i \mu^i(e^i).$$    (13.11)
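
The privacy-preserving property says that each agent decides whether a message is acceptable using only its own local environment, and that the equilibrium messages are those accepted by every agent. The Python sketch below illustrates this on a deliberately tiny, hypothetical instance with one user and one link, using an integer grid of (price, demand) messages. The quadratic utility $U(x) = wx - x^2/2$ is an assumption chosen only to keep the arithmetic on integers (it is increasing only for $x < w$), and the function names are not from the text.

```python
PRICES = range(0, 4)      # hypothetical price grid
DEMANDS = range(0, 6)     # hypothetical demand grid
MESSAGES = {(p, x) for p in PRICES for x in DEMANDS}


def mu_user(w):
    """Messages acceptable to the user; depends only on the user's parameter w.

    First-order condition for U(x) = w*x - x**2/2: U'(x) = w - x = p.
    """
    return {(p, x) for (p, x) in MESSAGES if x == max(0, w - p)}


def mu_network(c):
    """Messages acceptable to the network; depends only on the capacity c.

    Feasibility x <= c and complementary slackness p * (c - x) = 0.
    """
    return {(p, x) for (p, x) in MESSAGES if x <= c and p * (c - x) == 0}


def mu(w, c):
    """System equilibrium messages: the intersection of the individual sets."""
    return mu_user(w) & mu_network(c)


equilibrium = mu(w=4, c=2)
```

For $w = 4$ and $c = 2$ the only message accepted by both agents is price 2 with demand 2, at which the user’s marginal utility equals the price and the link is exactly full.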


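To determine the equilibrium correspondence for the first $N$ agents (the users)
we define the Lagrangian function $\Lambda(x, \lambda)$ (see [12]) of Problem P:

$$\Lambda(x, \lambda) := \sum_{i \in \mathcal{N}} U_i(x_i) + \sum_{l \in L,\, k \in K} \lambda_{lk} \Big( c_{lk} - \sum_{i \in \mathcal{N}} \sum_{j \in M_i} x_{ij}\, r^{ij}_{lk} \Big)$$    (13.12)

with $\lambda_{lk}$ being the Lagrangian multiplier corresponding to the $k$th resource on the
$l$th link.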

Using the first order optimality conditions, we define the equilibrium
correspondence of the first $N$ agents (the users) as follows:

$$\mu^i(U_i(\cdot)) := \Big\{ (p, x) \in M \;\Big|\; \frac{\partial U_i(x_i)}{\partial x_{ij}} - p_{ij} = 0, \; j \in M_i \Big\},$$    (13.13)

where $p$ is the vector of prices for the services requested by the users, $x$ is the vector
of demands requested by the users, $x_i$ is the vector of demands requested by user $i$
and $x_{ij}$ is the demand of user $i$ for service $j$.

From equation (13.12) and the Karush-Kuhn-Tucker (KKT) conditions, the
equilibrium correspondence for agent $N + 1$ (i.e. the network) is:

$$\mu^{N+1}\big(T_L \times R_{T_L} \times K \times c_K \times F_{ij}(\cdot)\big) := \Big\{ (p, x) \in M \;\Big|\; \sum_{i \in \mathcal{N}} \sum_{j \in M_i} x_{ij}\, r^{ij}_{lk} \le c_{lk}, \;\; r^{ij} \in F_{ij}(R_{T_L}, K),$$
$$\lambda_{lk} \Big( c_{lk} - \sum_{i \in \mathcal{N}} \sum_{j \in M_i} x_{ij}\, r^{ij}_{lk} \Big) = 0, \;\; p_{ij} = \sum_{l \in L,\, k \in K} r^{ij}_{lk}\, \lambda_{lk} \Big\}.$$    (13.14)


Outcome Function: A function which translates messages into actions is called an
outcome function, and it is denoted by $h$ ($h : M \to A$). In Problem P the outcome
function is $h(p, x) := x$.


Key results 


The main features of the market mechanism of Section 13.3.1 are: 


• The mechanism achieves the solution of the centralized resource allocation
problem P and satisfies the informational constraints P.f and P.g (see [125, 129, 132]).

• The dimension of its message space is a lower bound on the dimension of the
message space of any goal-realizing and regular mechanism for routing in unicast
with QoS requirements (see [125]).

• In the case of rate allocation with fixed routes and without QoS the mechanism
is informationally efficient⁹ (see [125, 128]).


13.3.2 Multi-rate multicast 
Problem formulation 


Let $\mathcal{N} = \{1, 2, \dots, N\}$ denote the set of users/agents requesting various services
from a network. We assume that the network consists of a set $L$ of unidirectional
links, with a topology denoted by $T_L$, and each link $l \in L$ having finite capacity $c_l$.
There is a set $M$ of multicast groups. Each multicast group is a tree. Each multicast
tree $m \in M$ is specified by $\{S_m, R_m, L_m\}$, where $S_m$ is the unique source node,
$R_m$ is the set of receiver nodes, and $L_m$ is the set of links used by the group.

We denote by $R \triangleq \bigcup_{m \in M} R_m$ the set of all receivers over all the multicast
groups, and by $R_{l,m}$ the set of all the receivers of multicast group $m \in M$ using link
$l \in L$.

We assume that a unique user is connected to each receiver node $r \in R$. Each
user $r$ has a utility function $U_r(x_r)$, where $x_r$ is the rate at which $r$ receives data.
This utility function can be interpreted either in terms of the perceived quality of the
service received or the amount paid in order to receive the service.

We make the following assumptions:


Assumption 1. The utility functions $U_r(x_r)$ are strictly concave, differentiable and
increasing.


Assumption 2. The rates $x_r$ are assumed to be continuous variables.


Assumption 3. Rate allocations are done along fixed multicast trees with a fixed 
number of users. 


Under the assumptions above, we consider the following network multi-rate mul- 
ticast problem: 


$$\max_{x_r,\, r \in R} \; \sum_{r \in R} U_r(x_r)$$    (Max 1)

subject to:

$$\sum_{m \in M} \max_{r \in R_{l,m}} x_r \le c_l, \quad \forall l \in L,$$    (Max 1.a)

$$x_r \ge 0, \quad \forall r \in R,$$    (Max 1.b)

the informational constraints P.f and P.g of Section 13.3.1, and the following
additional constraint:


Users are unaware of the method used for service delivery (e.g. unicast vs. multicast).    (P.h)


9 Our notion of informational efficiency imposes more requirements on the properties of a
mechanism than the standard notion of informational efficiency. To the best of our
knowledge an analysis similar to that presented in [125, 128] was conducted only within the
framework of production economies where the agent utility functions are of the
Cobb-Douglas form [88] or quadratic form [81].


Constraint Max 1.a is known as the capacity constraint. For this constraint to be
satisfied, on each link the sum of the rates used by the multicast trees cannot exceed
the link capacity, where the rate a tree uses on a link is the maximum of the rates of
its receivers using that link. The capacity constraint ensures that for all the multicast
trees, the rate on each branch of a tree is less than or equal to the rate on its parent branch.
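
The effect of the maximum inside the capacity constraint is easiest to see on a small tree. The following Python sketch is a hypothetical illustration in which the data layout and function names are assumptions: a single multicast group has two receivers that share a trunk link and then branch onto separate leaf links. On the shared link the group consumes the larger of the two receiver rates, whereas serving the same receivers by separate unicast flows would consume their sum.

```python
def multicast_load(receivers_by_group, rates):
    """Load on one link: for each group, the largest rate among its receivers there.

    receivers_by_group: {group: [receiver, ...]}, lists assumed non-empty.
    """
    return sum(max(rates[r] for r in receivers)
               for receivers in receivers_by_group.values())


def unicast_load(receivers_by_group, rates):
    """Load on the same link if every receiver were served by its own flow."""
    return sum(rates[r] for receivers in receivers_by_group.values() for r in receivers)


def feasible(link_receivers, rates, capacity):
    """Check non-negativity of the rates and the capacity constraint on every link."""
    return all(rate >= 0 for rate in rates.values()) and all(
        multicast_load(link_receivers[link], rates) <= capacity[link]
        for link in capacity
    )


# Hypothetical tree: group "m1", receivers "r1" and "r2" sharing the link "trunk".
link_receivers = {
    "trunk": {"m1": ["r1", "r2"]},
    "leaf1": {"m1": ["r1"]},
    "leaf2": {"m1": ["r2"]},
}
capacity = {"trunk": 5.0, "leaf1": 4.0, "leaf2": 5.0}
rates = {"r1": 3.0, "r2": 5.0}
```

With these numbers the trunk carries 5 units under multi-rate multicast, which fits its capacity, while the unicast load of 8 units would not.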

Assumption P.h is critical in what follows, as it justifies the price-taking assump- 
tion we make in the sequel. If the method of service delivery is known to the users, 
then the mechanism proposed in Section 13.3.2 for the solution of problem Max 1 
together with P.f-P.g may not be appropriate for multi-rate multicast service provi- 
sioning. This is because in this situation common links have features of public goods, 
and the mechanism proposed in Section 13.3.2 leads to the “free rider” problem [82, 
Chapter 11], which in the case of multi-rate multicast service provisioning manifests 
itself as follows: users who use a common link and demand less than the rate in the 
link do not participate in the price-sharing of the link (see [125, 126]). 

The multi-rate multicast problem with the features above is an information- 
ally decentralized resource allocation problem where there are two distinct types 
of agents: network (network manager) and users. A major difference between the 
multi-rate multicast (under the assumptions above) and unicast is the fact that users 
connected to the same multicast tree receive service over “common links.” Thus, to 
determine optimal (with respect to the performance criterion defined in Max 1) rate 
allocations in multi-rate multicast service provisioning one must find how the prices 
of “common links” should be shared by their users. 

The goal in multi-rate multicast is to develop a mechanism for rate allocation 
along the various multicast trees in order to: (i) generate services that maximize a 
social welfare function consisting of the sum of individual user utility functions; and 
(ii) satisfy the informational constraints P.f-P.h. 

Multicast service provisioning problems have received significant attention. Within 
the context of single rate and multi-rate multicast service provisioning, studies 
have addressed issues of bandwidth/rate allocation [23, 31, 57, 59, 111, 115- 
119, 122, 134], routing [27, 28, 102, 121, 144] and reliability [30, 34, 60]. Most of 
the literature on rate allocation is done via the notion of fairness [23, 31, 111, 115- 
119, 122, 134], specifically, max-min fairness [14] and proportional fairness [63]. 
In particular, [119] develops a unified framework for diverse fairness objectives via 
the notion of fair allocation of utilities. A more general approach to rate allocation 
is via utility maximization. Utility maximization is more general because rate
allocation with the fairness property is utility maximizing when the utility has a
special form [23, 86, 119, 122]. Although utility maximization has been extensively
studied within the context of unicast rate allocation to achieve congestion control
[11, 58, 62, 64, 67, 68, 75, 129, 132], relatively few studies have approached the
multi-rate multicast allocation problem via a general utility maximization formulation, with
the notable exceptions being [23, 57, 59]. Problem Max 1 together with Assumptions
1-3 and constraints Max 1.a, Max 1.b, P.f-P.h is similar in spirit to those
formulated and analyzed in [23, 57, 59]. However, the decentralized resource
allocation mechanism presented in this chapter is different from the mechanisms proposed
in [23, 57, 59]. The development here follows [125, 126], where all the details of the
proposed algorithm and of the key results can be found.

In the next section we present a market-based pricing mechanism that satisfies 
the informational constraints imposed by the nature of the network problem, and 
achieves a solution of the centralized optimization problem Max 1. This market 
mechanism is based on a price splitting algorithm and on properties of price splitting, 
all of which are presented formally in [126]. 


Market mechanism 


We proceed as follows: We first describe a competitive market economy con- 
sisting of two types of agents: network and users. Then, within the context of this 
market we specify an iterative procedure (a tâtonnement process) which leads to an 
allocation that achieves a solution to Problem Max 1. 


Description of the market 

The market economy is composed of two types of agents: network (or network 
manager) and users. The network communicates directly with each user, and the 
users do not communicate with one another. The messages exchanged by the market 
agents are service prices and service demands. 

For conceptual clarity we decompose the network manager into two distinct en- 
tities: service provider and auctioneer. The market features and the relations among 
the market agents are as follows: The resource traded at each link is the available 
communication rate. The rate price at link $l \in L$ is denoted by $\lambda_l$. The prices $\lambda_l$,
$l \in L$, are set by the auctioneer. Based on $\lambda_l$, $l \in L$, the service provider sets up
prices per unit of rate along each path of each multicast tree and communicates these 
prices to the users. Based on the service prices announced by the service provider 
the users demand a certain amount of service from the network in order to maximize 
their utility functions. Based on user demands the auctioneer updates the price per 
unit of rate at each link of the network. 

We make the assumption that the service provider and users are price takers. They 
act as if their behavior has no effect on the equilibrium prices reached by the market 
allocation process. As pointed out in Section 13.3.2, this assumption is justified by 
P.h, that is, the fact that the users are unaware of the type of service received and 
they do not know the number of other users requesting service from the network. 
The price taking assumption and the fact that we try to maximize the users’ utilities 
imply that: (i) the service provider will not attempt to make a profit; and (ii) the 
service prices are directly derived from resource prices. 

Below we describe each type of agent in more detail. 




Service provider: The service provider receives from the auctioneer a rate price $\lambda_l$
for each link $l$ of the network. Based on these prices, it has to compute the price per
unit of rate for each user.

A major challenge in solving multi-rate multicast problems through pricing is 
the determination of the set of user service prices from the set of link prices. This 
challenge comes from the fact that for each link which is common to multiple users 
of a multicast tree one needs to determine the portion of the price which is incurred 
by each of the users sharing the link. These price shares need to be determined in a 
way that satisfies the informational constraints imposed by the nature of the network 
multi-rate multicast problem. 

In [126] we present a distributed algorithm which for a fixed set of link prices $\lambda$
computes a set of link price shares $\gamma(\lambda)$. Based on these price shares $\gamma(\lambda)$ the
algorithm also computes the service prices $p(r, \lambda) = p(r, \gamma(\lambda))$ which generate demands
that maximize the total user utility along any multicast tree for the fixed set of link
prices $\lambda$.


Users: Users are price takers and request service from the service provider. For each
user $r$ of the multicast tree $m \in M$ the service provider announces a service price
$p(r, \lambda)$. Based on $p(r, \lambda)$, user $r$ determines its desired service rate by solving:

$$x_r(p(r, \lambda)) = \arg\max_{x \ge 0} \{ U_r(x) - p(r, \lambda) \times x \}.$$    (13.15)


Auctioneer: The role of the auctioneer is to regulate the prices of resources, based on
the aggregate excess demand vector $z(\lambda)$,

$$z_l(\lambda) \triangleq \sum_{m \in M} \max_{r \in R_{l,m}} x_r(p(r, \lambda)) - c_l$$    (13.16)

at every link $l \in L$.
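
The user and auctioneer computations can be combined in a short Python sketch. It takes the service prices $p(r, \lambda)$ as given inputs, because the price-splitting algorithm of [126] that produces them is not reproduced here. The utility form $w_r \log(1 + x)$ is an assumption consistent with Assumption 1, and the tree, weights, prices and function names are hypothetical.

```python
def demand(weight, price):
    """Rate maximizing weight*log(1+x) - price*x over x >= 0 (assumed utility form)."""
    return max(0.0, weight / price - 1.0)


def excess_demand(link_receivers, weights, service_price, capacity):
    """Return the receiver rates and the excess demand on every link.

    link_receivers: {link: {group: [receiver, ...]}}, lists assumed non-empty.
    On each link a group contributes the largest rate among its receivers there.
    """
    x = {r: demand(weights[r], service_price[r]) for r in weights}
    z = {
        link: sum(max(x[r] for r in receivers) for receivers in groups.values())
        - capacity[link]
        for link, groups in link_receivers.items()
    }
    return x, z


# Hypothetical tree: one group, two receivers sharing the link "trunk".
link_receivers = {
    "trunk": {"m1": ["r1", "r2"]},
    "leaf1": {"m1": ["r1"]},
    "leaf2": {"m1": ["r2"]},
}
capacity = {"trunk": 5.0, "leaf1": 4.0, "leaf2": 5.0}
weights = {"r1": 4.0, "r2": 3.0}
service_price = {"r1": 1.0, "r2": 0.5}   # assumed given by the service provider
rates, z = excess_demand(link_receivers, weights, service_price, capacity)
```

With these inputs the receivers request rates 3 and 5, the trunk and the second leaf link are exactly full, and the first leaf link has one unit of slack, so no component of the excess demand is positive.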


The tâtonnement process 


We present a tâtonnement process, described by an algorithm called Algorithm
2, that describes how the market works. The algorithm proceeds iteratively as
follows:


Step 1: The multicast trees are fixed.

Step 2: The auctioneer announces prices $\lambda := \{\lambda_l, l \in L\}$ per unit of rate at each
link of the network.

Step 3: The service provider receives the link prices $\lambda$ announced by the
auctioneer. Given the link prices $\lambda$, the service provider communicates with the
users via an iterative process in order to determine the optimal service prices
$p(\lambda) := \{p_i(\lambda), i = 1, 2, \dots, N\}$. During the iterative process the service
provider and the users exchange prices per unit of service $p$ and service
demands $x(p)$, with $x(p)$ satisfying (13.15). This iterative process is described in
detail in [126, Section IV, Appendix A]. During the iterative process between
the service provider and users, the auctioneer checks if the sign of the excess
demand function $z(\lambda)$ is positive on some link or negative on all links.

Step 4: If at the end of Step 3 $z(\lambda) \le 0$ the process ends. Otherwise the auctioneer
changes the prices $\lambda$ of resources according to a specific mechanism based on
Scarf’s algorithm (which is described in detail in [125, 126, 132]), announces
new prices, say $\lambda'$, and the process is repeated from Step 3 on.

Fig. 13.2. Market mechanism.


The steps above are shown pictorially in Fig. 13.2. The figure illustrates the fact
that the algorithm contains two loops: an outer loop and an inner loop. The inner loop
describes the iterative process used by the service provider to determine user service
prices $p(\lambda)$ (hence user demands) for fixed link prices $\lambda$ set by the auctioneer. For
fixed $\lambda$, the inner loop also determines how prices of links that are common to many
users are optimally shared by these users. The outer loop determines the iterative
process used by the auctioneer to determine link prices based on excess demand.
The iterative process of the inner loop is guided by the results of [126, Section II]
and is presented in [126, Section IV, Appendix A]. The iterative process of the outer
loop is guided by Scarf’s Algorithm [120].¹⁰
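The two-loop structure can be sketched in a few lines under strong simplifying assumptions. In the sketch below every tree has a single receiver, so the inner loop reduces to summing link prices along the route and no price sharing is needed. Utilities are assumed to be quadratic, $U_r(x) = a_r x - x^2/2$. The outer loop uses a plain projected price adjustment, not Scarf's algorithm. All names (`tatonnement`, `routes`, `step`) and numbers are hypothetical.

```python
def route_price(route, link_price):
    """Service price of a user: sum of the prices of the links it uses."""
    return sum(link_price[l] for l in route)


def demand(a, price):
    """argmax over x >= 0 of a*x - x**2/2 - price*x (an assumed utility)."""
    return max(0.0, a - price)


def tatonnement(routes, a, capacity, step=0.1, tol=1e-9, max_iter=10_000):
    """Raise prices on over-demanded links, lower them on idle ones."""
    link_price = {l: 0.0 for l in capacity}
    x = {}
    for _ in range(max_iter):
        # Inner loop (trivial here): price each user and collect its demand.
        x = {r: demand(a[r], route_price(routes[r], link_price)) for r in routes}
        # Excess demand per link, as in (13.16) with one receiver per tree.
        z = {l: sum(x[r] for r in routes if l in routes[r]) - capacity[l]
             for l in capacity}
        # Stop when no link is over-demanded and priced links are fully used.
        if all(z[l] <= tol and abs(link_price[l] * z[l]) <= tol for l in capacity):
            break
        # Outer loop: projected price adjustment (a stand-in for Scarf's algorithm).
        link_price = {l: max(0.0, link_price[l] + step * z[l]) for l in capacity}
    return link_price, x


routes = {"A": ["l0"], "B": ["l0", "l1"], "C": ["l1"]}
a = {"A": 2.0, "B": 3.0, "C": 2.0}
capacity = {"l0": 1.0, "l1": 1.0}

prices, rates = tatonnement(routes, a, capacity)
print(prices, rates)
```

For this toy network both link prices settle near 4/3. The simple update converges here only because the assumed utilities are smooth and the step is small, which is in the spirit of the caveat in footnote 10.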


Embedding into Hurwicz’s framework 


We embed the multi-rate multicast problem, formulated in Section 13.3.2, within 
Hurwicz’s abstract framework described in Section 13.2.2. 


The resource allocation problem: 


Environment: For the network Problem Max 1, the local environment $E^i$ of each user
$i$ is the set of differentiable concave functions on $\mathbb{R}_+$. The local environment of the
network is the set $E^{N+1} \triangleq \{\{T\} \times \{L\} \times \{M\} \times \{c_L\}\}$. The system environment
is denoted by $E := (\otimes_{i \in N} E^i) \otimes E^{N+1}$.


Action Space: The action space is the feasible region of Problem Max 1. 


Goal Correspondence: For Problem Max 1, $\pi : E \twoheadrightarrow A$ is defined as follows:
$$\pi(e) := \arg\max \text{Max 1}(e). \tag{13.17}$$


The environment, action space and goal correspondence describe the resource 
allocation problem. 


Mechanism specification 


Message Space: In the pricing mechanism proposed in Section 13.3.2 for solving 
Problem Max 1, two types of messages are exchanged among agents: 


• To each user $i$ the network communicates a service price $p_i$.
• Each user $i$ communicates a service demand $x_i$ to the network.


Thus, the message space for Problem Max 1 has dimension equal to the number
of user demands $\sum_{m \in M} |R_m|$ (where $|\cdot|$ denotes the cardinality of the set) plus the
number of service prices.¹¹


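Equilibrium Correspondence: Using the first order optimality conditions, we define
the equilibrium message correspondence of the first $N$ agents (the users) as follows:


$$\mu^r(U_r) := \left\{ (p_r, x_r) \in \mathcal{M} \;\middle|\; \frac{\partial U_r(x_r)}{\partial x_r} - p_r \le 0;\ x_r \left( \frac{\partial U_r(x_r)}{\partial x_r} - p_r \right) = 0 \right\}. \tag{13.18}$$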


10 It may be possible to use algorithms other than Scarf’s at the outer loop; however, to prove
convergence of such algorithms we may need to impose additional constraints on the users’
utility functions (e.g. second order differentiability of the utility functions).

11 In this setup, since no two services are identical (i.e. no two services are part of the same
multicast tree and are delivered over the same links), the number of service prices is equal
to the number of user demands.


248 T.M. Stoenescu and D. Teneketzis 


To present the equilibrium message correspondence for the network, we consider 
the following problem: 


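$$\max_{x_r,\, r \in R} \sum_{r \in R} U_r(x_r) \qquad \text{(Max 2)}$$

such that:

$$\sum_{m \in M} x_{r_{lm}} \le c_l, \quad \forall l \in L,\ \forall r_{lm} \in R_{lm} \tag{13.19}$$
$$x_{r_{lm}} \ge 0, \quad \forall l \in L,\ \forall r_{lm} \in R_{lm} \tag{13.20}$$


where $r_{lm}$ denotes a receiver on the $m$-th multicast tree that employs link $l$.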


Let $|M|$ denote the number of multicast trees in the network. We define the set
$\Phi(l) := \{(r_{l1}, \ldots, r_{l|M|}) : r_{li} \in R_{li},\ 1 \le i \le |M|\}$ to be the set of
tuples, each tuple consisting of one receiver from each multicast tree, and every
receiver of each tuple is downstream from link $l \in L$ on its respective multicast
tree. We note that the number of elements in $\Phi(l)$ corresponds to the number of
constraints for link $l$ in the set of equations (13.19). We denote by $\mathbf{r}_l$ an element of
$\Phi(l)$, and by $r_{lm}$ a receiver on the $m$-th multicast tree of $\mathbf{r}_l$. Note that if for some
multicast tree $m \in M$ and some link $l \in L$, $R_{lm} = \emptyset$, i.e. link $l$ is not part of
the multicast tree $m$, then we let the $r_{lm}$ entry of the $\mathbf{r}_l$ tuple be empty, i.e. no
receiver from multicast tree $m$ is assigned to any of the $\mathbf{r}_l$ tuples. We define the set
$\Phi(l, r) \triangleq \{(r_{l1}, \ldots, r_{l|M|}) : r \in \{r_{l1}, \ldots, r_{l|M|}\},\ r_{li} \in R_{li},\ 1 \le i \le |M|\}$ to
be a subset of $\Phi(l)$ where all the tuples contain receiver $r$.
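The construction of these tuple sets is a Cartesian product over the trees that use the link, which the following sketch makes concrete. The helper names `phi` and `phi_with` and the toy receiver sets are assumptions introduced for illustration only.

```python
from itertools import product


def phi(receivers_by_tree):
    """Enumerate Phi(l): one receiver from every tree that uses link l.

    receivers_by_tree maps a tree id m to R_lm; trees with an empty R_lm
    contribute no entry to the tuples, as described in the text.
    """
    used = [sorted(rs) for _, rs in sorted(receivers_by_tree.items()) if rs]
    return list(product(*used))


def phi_with(receivers_by_tree, r):
    """Phi(l, r): the tuples of Phi(l) that contain receiver r."""
    return [t for t in phi(receivers_by_tree) if r in t]


# Hypothetical link l used by three trees; tree 3 has no receiver below l.
R_l = {1: {"a", "b"}, 2: {"c", "d", "e"}, 3: set()}

tuples = phi(R_l)
print(len(tuples))          # 2 * 3 = 6 constraints for this link in (13.19)
print(phi_with(R_l, "a"))
```

The count grows multiplicatively with the number of receivers per tree, which is why (13.19) contributes many constraints per link.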

Using the notation above we can rewrite equation (13.19) as follows: 





$$\sum_{m \in M} x_{r_{lm}} \le c_l, \quad \forall l \in L,\ \forall \mathbf{r}_l \in \Phi(l).$$


Then, the Lagrangian function for Problem Max 2 can be expressed as: 


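$$\mathcal{L}(x, \zeta) = \sum_{r \in R} U_r(x_r) - \sum_{l \in L} \sum_{\mathbf{r}_l \in \Phi(l)} \zeta_{\mathbf{r}_l} \Big( \sum_{m \in M} x_{r_{lm}} - c_l \Big) \tag{13.21}$$


where $\zeta = \{\zeta_{\mathbf{r}_l} : \zeta_{\mathbf{r}_l} \in \mathbb{R}_+,\ \mathbf{r}_l \in \Phi(l),\ l \in L\}$.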
Consequently, the first order optimality conditions are: 





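$$\zeta_{\mathbf{r}_l} \Big( \sum_{m \in M} x_{r_{lm}} - c_l \Big) = 0, \tag{13.22}$$
$$\Big( \frac{\partial U_r(x_r)}{\partial x_r} - \sum_{l \in L_r} \sum_{\mathbf{r}_l \in \Phi(l, r)} \zeta_{\mathbf{r}_l} \Big) = 0 \tag{13.23}$$


where $L_r$ is the set of links connecting receiver $r$ to the source.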
From equations (13.22) and (13.23), the equilibrium correspondence for agent
$N + 1$ (i.e., the network) is:

$$\mu^{N+1}(\{T\} \times \{L\} \times \{M\} \times \{c_L\}) = \Big\{ (p, x) \in \mathcal{M} \;\Big|\; \sum_{m \in M} x_{r_{lm}} \le c_l,\ \zeta_{\mathbf{r}_l} \Big( \sum_{m \in M} x_{r_{lm}} - c_l \Big) = 0,\ p_r = \sum_{l \in L_r} \sum_{\mathbf{r}_l \in \Phi(l, r)} \zeta_{\mathbf{r}_l} \Big\} \tag{13.24}$$

where $p := (p_1, p_2, \ldots, p_{|R|})$ is the vector of prices for the services requested by the
users, and $x := (x_1, x_2, \ldots, x_{|R|})$ is the vector of demands requested by the users.


Outcome Function: In Problem Max 1 the outcome function is $h(p, x) := x$.


Key results 


The main features of the market mechanism of Section 13.3.2 are: 


• It achieves the solution of the centralized problem Max 1, and satisfies the
informational constraints (P.f) and (P.g) (see [125, 126]).
• It is informationally efficient (see [125]).


13.3.3 Discussion 


In Sections 13.3.1 and 13.3.2, we presented an approach for optimal resource al- 
location for both unicast with routing and QoS requirements, and multi-rate multicast 
service provisioning. The main features of this approach are: 


(1) The objective is to maximize the total value of the network to its users.

(2) The agents are price takers in the markets in which they participate. 

(3) The users’ utility functions u; are quasi-linear, continuously differentiable, and 
strictly concave. 

(4) There is no cost associated with the supply of network resources. 


We now briefly discuss and critique each one of the features above separately. 
For more details we refer the reader to [125, 126, 129, 132]. 


(1) In the problems considered we assumed that the objective function of interest 
was to maximize the sum of individual network users’ utility functions. It may not 
be obvious why this is a reasonable objective to consider. 

It is important to realize that our point of view is primarily normative, not de- 
scriptive. That is, we have taken a particular objective function—one which we be- 
lieve is often reasonable—and studied whether a network resource pricing scheme 
exists that can achieve an optimum for that particular function, and how one might 
implement that allocation with a market-based algorithm. Thus, we have demon- 
strated the feasibility of using pricing to achieve a particular performance goal. We 
are not claiming that this goal describes any particular actual network environment. 
Nor are we making the stronger normative claim that this objective function should 
be adopted in any particular setting.

We do, in fact, believe that maximizing the sum of user utilities is a reasonable
description for a wide variety of network allocation problems. Suppose we are
considering a corporate intranet. If the corporation’s overall objective is to maximize
its profits (in present value), then the appropriate interpretation of our problem is to
define each user’s “utility” as that user’s contribution to corporate profit as a function
of the network services it consumes. In other words, the corporation is not (directly)
interested in how personally happy an employee is with the network, but in how
much the network enhances the employee’s productivity. Then the sum of user
utilities will be the contribution of network services to corporate profits, which is
precisely the firm’s objective function for this part of the overall management problem.
Although it may seem difficult to come up with a reasonable representation of the 
effect of network services on each user’s contribution to corporate profits, at some 
level this is precisely the problem corporations need to solve for allocating equip- 
ment, office space, subordinates and so forth to each employee—it is well beyond 
the scope of our research to worry about how the corporation specifically formulates 
these valuation functions. 

Thus, although our method of using prices to allocate network resources cannot 
be directly applied to every allocation problem with any reasonable objective func- 
tion, we believe that it has broad applicability to many existing situations. In any 
case, when our objective function is the desired goal, we have carefully analyzed the 
existence and implementability of a pricing scheme to support that objective. 


(2) For the problems presented in Sections 13.3.1 and 13.3.2, we have imposed the 
price-taking assumption on the agents of the market economy. How useful is the
price-taking assumption? It is not essential for a proof that an algorithm exists that 
will clear the markets and reach some equilibrium allocation of network resources. 
However, in general, that allocation will not be a solution of our original optimization 
problem. 

As a general matter we could show that equilibrium allocations based on behav- 
ior other than price-taking will lead to less efficient allocations, that is, allocations 
that do not maximize the sum of user utilities subject to the technology constraints. 
Therefore, we did not consider markets in which agents exhibit different types of 
strategic behavior, but limited ourselves to the price-taking behavior that we can 
show can be harnessed to yield a solution to the centralized optimization problem. 

Restricting attention to the price-taking case may not in practice be as restrictive 
as it seems. Consider the example of a corporate intranet with a single monopoly 
provider of resources and services. If the management instructs the resource and 
service provider to behave “as if” it is a price taker (and provides compensation 
incentives that make it in the provider’s best interests to do so) then the desired out- 
come can be achieved. Essentially, this requires compensating the provider based 
on the value of the allocation to the company as a whole, rather than based on the 
provider’s own local “profits.” If the network is to be managed with an agent-based 
control system, the agents should be programmed to act as price takers, whether or 
not other programmable strategies might seem more desirable from the local
viewpoint of the agents.

In a more open, conventionally market-based system, such as a commercial
market for virtual data circuits, it is also possible that at a given moment some
participants might have some market power, which is to say that they are cognizant of an
opportunity to improve their position by acting strategically with respect to price- 
setting, rather than as a price taker. In such a setting, it might not be possible to 
directly control behavior to make those participants behave “as if” they are price 
takers. However, if there are no artificial barriers to entry by other providers—for 
example, if it is possible for another competing firm to build an interconnected net- 
work of links with buffers and bandwidth—then it will tend to be the case that in a 
long-run equilibrium surviving agents will be those who behave as price takers (com- 
petition will drive others out of the market). Therefore, we believe there are many 
circumstances under which the conditions will exist, or can be imposed, that are nec- 
essary for our approach to provide an equilibrium that is a social welfare maximizing 
solution of the centralized network problem. 

The price-taking assumption is harder to justify in multi-rate multicast service 
provisioning than in unicast with routing. Treating users as price-takers in Problem 
Max 1 is reasonable under the assumption that they are unaware of the method of 
service delivery (assumption P.h). Without such an assumption, a formulation of 
multi-rate multicast as a public goods problem, albeit a non-typical one, may be 
more appropriate than that of Section 13.3.2. The investigation of Problem Max 1 
without assumption P.h remains an interesting open problem. 


(3) Since we assume that the expenditure on the good under study is a small portion
of a consumer’s total expenditure, the small size of the market under study should 
lead the prices of the other goods to be approximately unaffected by changes in 
this market. Because of this fixity of other prices, we are justified in treating the 
expenditure on these other goods as a single composite commodity, which we call 
the numeraire. This allows us to express the utility function as a function of the goods 
under study and the numeraire. 

The choice of representing users’ preferences by quasi-linear objective functions 
also imposes the constraint that there are no income effects on network service
demand; that is, changes in income or budget available to the users do not change the
amount of network services they wish to purchase. This is a typical simplifying as- 
sumption in the economic literature when the budget share of the services of interest 
is small, e.g. when network services are only a relatively small amount of the users’ 
total expenditures. 
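To see why quasi-linearity removes income effects, consider the following short derivation. It is a sketch under assumptions not stated in the text: a single network service $x$ with price $p$, a numeraire $y$ with price 1, a budget $w$ (a symbol introduced here for illustration), and an interior solution.

```latex
\begin{align*}
&\max_{x \ge 0,\; y}\; U(x) + y \quad \text{subject to} \quad p\,x + y = w \\
&\Longrightarrow\; \max_{x \ge 0}\; U(x) - p\,x + w \\
&\Longrightarrow\; U'(x^*) = p \quad \text{(interior solution)}
\end{align*}
```

The budget $w$ enters only as an additive constant, so the optimal demand $x^*$ depends on the price $p$ alone. This has the same form as the user problem (13.15).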

The rest of the assumptions made for the utility functions are standard in the
analysis of economic optimization problems. The continuous differentiability
assumption comes from the idea that we may look at a set of users that
may have similar utilities as a group, and in this case the group utility will be a
smoothed out version of each user’s utility. The strict concavity assumption is natural
when we are working with goods that are desirable.


(4) In both network problems considered in Sections 13.3.1 and 13.3.2, we assumed 
that there is no cost in supplying network resources (bandwidth, buffers, etc.) to 
the market. This cost can be incorporated into our model if we subtract it from the
objective function of the optimization problem. We believe that the new problem will
have the same qualitative properties as the problem presented in this chapter, and thus
it may lead to a similar type of result.


13.4 Mechanism Design in Networks: 
An Implementation Theory Point of View 


In the previous section we considered a mechanism to be a set of rules which, 
if followed, generate allocations that satisfy a goal correspondence. Mechanisms of
this kind ignore issues of strategic behavior of individual agents. Thus, it may
not be possible to contractually enforce such mechanisms. To design resource allo- 
cation mechanisms that are contractually enforceable we have to take into account 
the divergence of individual preferences from the overall performance objective. 

In this section we discuss game forms that implement social choice rules
characterized by social welfare maximizing solutions. We concentrate on Nash
implementation and relate our discussion to the unicast problem with routing.

There are two distinct ways in which one can think of implementation of social 
welfare maximization rules in Nash equilibria. We present them below. 

When Nash implementation is the solution concept, an individual (user) needs 
to know not only his own preferences, but everyone else’s preferences so as to de- 
termine his equilibrium message(s). Thus, for Nash implementation purposes in uni- 
cast and routing, an environment of a user is an entire profile of utility functions 
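and $\{\{T_L\} \otimes \{R_{T_L}\} \otimes \{K\} \otimes F_{T_L}(R_{T_L}, K) \otimes \{c_K\}\}$, defined in Section 13.3.2.
Consequently, the space of user’s environments is


$$\hat{E} := (\otimes_{i \in N} E^i) \otimes \{\{T_L\} \otimes \{R_{T_L}\} \otimes \{K\} \otimes F_{T_L}(R_{T_L}, K) \otimes \{c_K\}\}, \tag{13.25}$$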


where all the components of the right-hand-side of (13.25) are defined in Section 
13.3.2. When the action space A is the feasible region of Problem P, and the message 
space $\mathcal{M}$ is


$$\mathcal{M} = \hat{E} \times A \times \mathbb{N} \tag{13.26}$$


where $\mathbb{N}$ is the set of natural numbers, then the goal correspondence $\pi : \otimes_{i=1}^{N} E^i \twoheadrightarrow A$,
described by the centralized solution of Problem P, can be implemented in Nash
equilibria by a game form $(\mathcal{M}, h)$ where the outcome function $h$ is defined in [84,
Theorem 3]. Such an implementation is possible for the following reason. For the
unicast problem with routing, $\pi$ is a Pareto correspondence; Pareto correspondences
are monotonic and possess the no veto power property [83, 84]; therefore, $\pi$ can
be implemented in Nash equilibria by the aforementioned game form whenever the
number of users in the network is greater than or equal to three [83, 84]. However,
the game form described above is infinite dimensional. Thus, the approach above to 
Nash implementation leads to game forms that are infeasible on information grounds.

An alternative way of proceeding with Nash implementation of the goal
correspondence $\pi$ (which could potentially result in game forms with finite dimensional
message spaces) is the following. Consider that users know their own environment, 
but not those of other users or the network. All users are involved in an unspecified 
message exchange process in which they grope their way to a stationary message 
and in which the Nash property is a necessary condition for stationarity. Experi- 
mental evidence [124] has shown that such an approach to Nash implementation is 
reasonable. 

An important open issue within the context of the second approach to Nash im- 
plementation is: What is the minimum dimensionality of the message space of game 
forms that implement social welfare maximizing rules (e.g. $\pi$) in Nash equilibria? It
is expected that, in general, an implementing mechanism with the Nash property in 
equilibrium messages will require a larger message space than the one that suffices 
for decentralized realization without regard to individual incentives. Reichelstein and 
Reiter [107] have shown that the statement above is true in the case of Nash imple- 
mentation of Walrasian allocations in exchange environments. The following exam- 
ple from [107] illustrates the fact that Nash implementations require larger message 
spaces than the corresponding decentralized realizations. 


Example 4.1 

Consider a resource allocation problem with two agents {1,2} and two goods
{X,Y}, where good X represents a desirable service and good Y has the
interpretation of money. Assume that the agents’ preferences over the goods are described, at
least locally, by quasi-linear¹² convex utility functions of the form


$$U_i(x, y \mid e_i) \triangleq e_i x - \frac{x^2}{2} + y, \quad e_i \in E^i,\ i \in \{1, 2\}.$$

The agents’ private objective is to maximize their individual utility function, 
while the social objective is to achieve a resource allocation which maximizes the 
sum of individual utility functions. 

Assume that the goods are distributed among the agents and the agents are per- 
mitted to trade. From the realization theory point of view there exists a goal realizing 
mechanism with a message space of dimension two [43]. Specifically, in the case in 
which both agents are truthful, the mechanism in which agent 1 sets the price for 
good X and agent 2 makes a request¹³ for good X based on the price set by agent 1
is social welfare maximizing. 

In [107] the authors show that there is no mechanism of dimension 2 which 
implements in Nash equilibria the social welfare maximizing rule for this problem. 
In particular, they show that the pricing mechanism where agent 1 sets the price


12 Let $X := \{x_1, x_2, \ldots, x_L\}$ be a set of commodities. A function $U(x)$ is called
quasi-linear with respect to commodity $L$ if it is of the form $U(x) = \bar{U}(x_1, x_2, \ldots, x_{L-1}) + x_L$.
Commodity $x_L$ is called the numeraire commodity. The numeraire commodity generally
has the interpretation of money.

13 If agent 2 makes a negative request for good X it means that he would like to sell that
amount of good X to agent 1.

for commodity X and agent 2 makes a request for this commodity based on the
price, does not have a Nash equilibrium which is social welfare maximizing. The
authors present a mechanism of dimension 3 which implements the social welfare
maximizing rule in Nash equilibria. This mechanism works as follows:

The agents’ message spaces are described by


$$M_1 = \{ m_1 \mid m_1 \in \mathbb{R}_+ \}, \tag{13.27}$$
$$M_2 = \{ (m_2, m_3) \mid (m_2, m_3) \in \mathbb{R}_+^2 \}. \tag{13.28}$$


The amount of commodity X exchanged is characterized by the outcome
functions


$$h_1^X(m) = m_1 - m_2, \tag{13.29}$$
$$h_2^X(m) = m_2 - m_1, \tag{13.30}$$


while the amount of the numeraire commodity Y exchanged by the agents is
described by:


$$h_1^Y(m) = -m_3 (m_1 - m_2), \tag{13.31}$$
$$h_2^Y(m) = -m_1 (m_2 - m_1) - (m_1 - m_3)^2. \tag{13.32}$$


In this mechanism $m_3$ has the interpretation of the price of good X. Agent 1
maximizes his utility function based on $m_3$ and sends his message $m_1$ to agent 2.
Agent 2 does not use $m_3$ as the price for commodity X, but rather uses message
$m_1$. Based on $m_1$, agent 2 maximizes his utility by choosing $m_2$. Agent 2 receives
a quadratic penalty for announcing a price which is not equal to $m_1$. This penalty,
along with the fact that he does not set his own price for good X, forces agent 2 to be
truthful in his messages. In [107] the authors prove that this mechanism implements
the social welfare maximizing rule in Nash equilibria.
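A small numerical check can make the mechanism more tangible. The sketch below assumes the quadratic quasi-linear utility $e_i x - x^2/2 + y$ and the outcome functions (13.29)-(13.32). It solves the agents' first order conditions in closed form, then checks on a grid that no unilateral deviation improves either payoff. The function names, the parameter values and the grid are assumptions for illustration, and the grid check is evidence for this instance, not a proof.

```python
def outcome(m1, m2, m3):
    """Allocations (x_i, y_i) from (13.29)-(13.32)."""
    x1, x2 = m1 - m2, m2 - m1
    y1 = -m3 * (m1 - m2)
    y2 = -m1 * (m2 - m1) - (m1 - m3) ** 2
    return (x1, y1), (x2, y2)


def utility(e, x, y):
    """Assumed quadratic quasi-linear utility e*x - x**2/2 + y."""
    return e * x - 0.5 * x ** 2 + y


def candidate_equilibrium(e1, e2):
    """Messages solving both agents' first order conditions."""
    price = 0.5 * (e1 + e2)
    return price, e2, price


def is_nash(e1, e2, m, grid, tol=1e-12):
    """Check that no unilateral deviation on the grid improves a payoff."""
    m1, m2, m3 = m
    (x1, y1), (x2, y2) = outcome(m1, m2, m3)
    u1, u2 = utility(e1, x1, y1), utility(e2, x2, y2)
    for d in grid:                      # agent 1 deviates in m1
        (dx, dy), _ = outcome(d, m2, m3)
        if utility(e1, dx, dy) > u1 + tol:
            return False
    for d2 in grid:                     # agent 2 deviates in (m2, m3)
        for d3 in grid:
            _, (dx, dy) = outcome(m1, d2, d3)
            if utility(e2, dx, dy) > u2 + tol:
                return False
    return True


e1, e2 = 4.0, 2.0
m = candidate_equilibrium(e1, e2)
(x1, _), (x2, _) = outcome(*m)
grid = [k / 10 for k in range(0, 101)]   # deviations in [0, 10]
print(m, x1, x2, is_nash(e1, e2, m, grid))
```

Under these assumptions the trade $x_1 = (e_1 - e_2)/2$ at the candidate equilibrium coincides with the allocation that maximizes the sum of the two utilities, and the announced price $m_3$ equals $m_1$, so the quadratic penalty vanishes.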














Recent game theoretic studies in network unicast problems (without routing) [35,
50-52, 114] have shown that when the dimension of the message space of the game
form is the same as that of the pricing mechanism which suffices for decentralized
realization, the game form does not implement the social welfare maximizing rule.
Specifically, the Nash equilibria determined in [35, 50-52, 114] are not social welfare
maximizing.

Example 4.1, the results in [35, 50-52, 114], as well as the results on implemen- 
tation in other solution concepts such as Bayesian Nash equilibria [98], and refine- 
ments of Nash equilibria (specifically, subgame perfect equilibria [87], and undomi- 
nated Nash equilibria [18]) reveal that: 


I1: Games that are induced by game forms whose message space has the same
dimension as that of the standard pricing mechanism have multiple equilibria,
some of which do not result in welfare maximizing solutions. Consequently,
such game forms cannot implement (in the corresponding solution concept)
social welfare maximizing rules.

I2: Implementation of social welfare maximizing rules in some solution concept (cf.
Section 13.2.3) requires a message space the dimension of which is larger than
that of the message space which suffices for decentralized realization.

I3: The increase in the dimension of the message space must accomplish the follow- 
ing: 

1. It must eliminate the equilibria that do not result in welfare maximizing 
allocations (cf. I1). 

2. It must maintain the equilibria that result in welfare maximizing allocations. 

3. It must not introduce additional equilibria, unless these equilibria result in
welfare maximizing allocations. 

4. It must induce price-taking behavior among the players (users). 


We have the following conjecture concerning the Nash implementation of the 
centralized solution of the unicast network resource allocation problem with routing 
and QoS requirements. 


Conjecture: In the case of unicast service provisioning, with N users and L services, 
there exists a game form which implements the centralized solution of Problem P in 
Nash equilibria and has a message space of dimension equal to the dimension of the 
pricing mechanism plus (1. Any mechanism with a message space of smaller di- 
mension can not implement the centralized solution of Problem P in Nash equilibria 
[127]. 


13.5 Conclusion 


Our goal was to discuss: (i) how decentralized network resource allocation prob- 
lems fit within the context of mechanism design; and (ii) how mechanism design can 
provide guidelines for the determination of resource allocation strategies that realize 
(in an informationally efficient manner) social welfare maximizing resource alloca- 
tion rules, and implement them in some appropriate behavioral equilibrium concept 
(e.g. Nash equilibrium) in an informationally efficient manner. Our discussion was 
guided by two classes of network resource allocation problems (unicast with routing 
and QoS requirements, and multi-rate multicast) that received significant attention in 
the engineering world. The results we presented reveal the connection between net- 
work resource allocation and mechanism design. The discussion also revealed that: 
(1) the aforementioned network problems are better understood from the realization
theory point of view than from the implementation theory viewpoint; and (2) a
formulation that is appropriate for the multi-rate multicast problem when the users are aware
of the method of service delivery remains an interesting open problem.

In our opinion, two problems of fundamental importance are: (1) the
characterization and classification of mechanisms in terms of their “communication” and
“information processing” requirements; and (2) how the theory of implementation can
guide the design of minimal message space mechanisms that implement, in some
appropriate solution concept, social welfare maximizing network resource allocation
rules.


Summary. This chapter provides a review of the research on design of control systems for
automated highway applications that has been conducted at the California PATH Program of 
the University of California, Berkeley. The primary focus is on the research areas that Pravin 
Varaiya led directly, particularly while serving as PATH Director from 1994 to 1997. He and 
the researchers working directly with him made significant contributions to the definition of 
control strategies for coordinating maneuvers of neighboring vehicles and for managing the 
flows of vehicles along network links, while also developing the hybrid system modeling and 
simulation tools needed to evaluate the effectiveness of these strategies. 


14.1 Introduction 


The concept of applying automation technology to the driving of road vehicles 
has been considered by the transportation community since the late 1930s, when a 
group of futurists collaborated with General Motors on the development of the
“Futurama” exhibit for the 1939-40 New York World’s Fair. World War II interrupted
developments until the late 1940s, when Vladimir Zworykin of RCA Laboratories 
revived the concept and started developing ideas about how to implement it tech- 
nically [1]. His work stimulated further activities at General Motors in the 1950s 
and early 1960s, leading to the development of concept cars, test track prototypes, 
and publicity films highlighting the safety, comfort and convenience of the “elec- 
tronic highway” [2]. After the General Motors research focus was reoriented toward 
nearer-term targets, academic research on highway automation thrived under federal 
and state sponsorship of the work of Prof. Robert Fenton at Ohio State University 
from the mid 1960s until about 1980 [3]. 

The California Department of Transportation (Caltrans) became interested in the 
potential of highway automation for increasing freeway capacity in the mid-1980s, 
when it was evident that it would no longer be possible to meet the state’s
transportation capacity needs by building additional freeways. In 1986, Caltrans sponsored the
University of California’s Institute of Transportation Studies in the creation of a new
research program called PATH (initially, Program on Advanced Technology for the 
Highway, then renamed as Partners for Advanced Transit and Highways in 1992), 
aimed at developing and testing advanced technologies to reduce traffic congestion 
[4]. 

When the new PATH Program began, Pravin Varaiya was one of the first faculty 
members to become involved. This was an opportunity to develop real-world applica- 
tions, with direct societal benefits, based on some of his existing research interests in 
distributed and hybrid control systems. His early involvement provided much of the 
credibility the new program needed to encourage other Berkeley faculty members, 
including Charles Desoer, Jean Walrand, Roberto Horowitz, Masayoshi Tomizuka, 
Karl Hedrick and Shankar Sastry, to follow. He also brought a very talented group
of students and post-doctoral researchers to work with him on the PATH research 
projects. 


14.2 Hierarchical Architecture for Automated Highway Systems 


The vital first step in developing the vibrant PATH research program on Auto- 
mated Highway Systems (AHS) was Pravin Varaiya’s definition of a hierarchical 
architecture that could accommodate the demanding functional requirements of the 
system. This architecture [5]-[7] has served as the foundation for nearly 15 years of 
intensive research at PATH, and has been adopted by many other researchers as well. 
It has the great virtue of being simple in structure, but so carefully defined that it can 
be applied to a broader range of transportation applications than AHS. It was essen- 
tial to begin with an architectural framework like this in order to make it possible to 
decompose the complexities of AHS into components and subsystems of manage- 
able scope and complexity, so that they could be designed, analyzed and tested by 
relatively small research teams using desktop computing resources. Varaiya’s intro- 
ductory paper about the architecture and the control problems associated with AHS 
in the IEEE Transactions on Automatic Control [7] has become essential reading 
for researchers working in this field, and has helped point many researchers toward 
fruitful research topics. 

This architecture, shown in Fig. 14.1, provides a logical decomposition, with the 
actions involving the fastest updates but the smallest information span and spatial 
scope toward the bottom, while those involving the broadest spatial scope and in- 
formation span but the slowest updates are toward the top. The information flows 
between adjacent layers are clearly defined, making it easy to decouple the layers 
logically for purposes of design and evaluation, and keeping each layer within a 
small enough range of time scales and physical scope that it can be modeled without 
excessive difficulty. Trade-offs involving allocations of functions among individual 
vehicles, groups of vehicles and the infrastructure can be determined logically within 
this framework. 

The physical layer is where we have the in-vehicle sensing of position relative 
to the lane and other vehicles, the actuation of steering, engine and brakes, and the 


14 Automated Highway Systems Research: The Influence of Pravin Varaiya 269 


[Figure: layered diagram; surviving labels: (Section), Coordination (Adjacent Vehicles), Regulation (Vehicle), Physical]

Fig. 14.1. Hierarchical architecture for vehicle automation. 


driver-vehicle interface for transitions to and from automatic control. At the regula- 
tion layer, we find the classic closing of the vehicle control loops for automatic steer- 
ing control and the control of vehicle speed and spacing relative to the preceding ve- 
hicle(s). The coordination layer is where vehicle trajectories are planned, maneuvers 
are coordinated among adjacent vehicles (lane changing, joining a platoon or split- 
ting from a platoon), and information is exchanged among groups of vehicles about 
traffic and road conditions. At the link layer, the emphasis shifts toward aggregate 
traffic flows rather than individual vehicles or groups of vehicles, with consideration 
of issues such as managing flows around incidents, balancing traffic across lanes, 
metering the entry rate of vehicles to maintain good local traffic flow and assigning 
suitable speed limits to each lane. Finally, at the network layer we have the most ag- 
gregate decision making about balancing traffic among alternative routes, rerouting 
traffic to avoid congested locations, metering access to manage system capacity, and 
managing serious incidents. 

Fig. 14.1 cannot show the full spreading out of the architecture from top layer to 
bottom layer because of space constraints, but it is important to note that there is a 
significant spreading from each layer to the next. A large urban region would have 
a single network-layer operation, but potentially hundreds of links. Each link, with 
a length of perhaps 1 to 2 km, could have dozens of clusters of closely-coordinated
vehicles, each of which could have a dozen or more vehicles. 

Since AHS is a safety-critical system, extraordinary measures need to be taken 
in its design, evaluation, testing and certification in order to provide assurances that 
people cannot be hurt by it. The hierarchical architecture simplifies this problem by 
confining the safety-critical elements to the three lowest layers, so that the higher
layers can be developed without the extra burden of safety assurance. 

In the remainder of this chapter, attention is primarily focused on the coordina- 
tion and link layers of the architecture and on the modeling and simulation tools 
for testing system concepts and designs, where Pravin Varaiya’s activity has been 
concentrated and where his influence on other researchers has been most significant. 


14.3 AHS Coordination Layer Research 


At the coordination layer, the primary issue is the coordination of maneuver
plans among vehicles, which has implications not only for capacity but also for
the safety of the vehicle operations. The level of coordination may be very low for 
automation schemes based on use of autonomous vehicles, but it is important to 
recognize that a certain degree of coordination already exists in conventional driving, 
implemented through drivers’ use of the horn, directional signals and brake lights, but 
sometimes more subtly through use of gestures, facial expressions and examination 
of the other driver (“does he see me?”). The design of an automated highway system 
must ensure that it is no less well coordinated than today’s driving. 

Pravin Varaiya launched research on coordination layer design by defining a 
comprehensive set of automated vehicle maneuver protocols for normal driving con- 
ditions as 13 state machines, specified in a formal language, “COSPAN,” and verified 
in [8]. This was an extremely important milestone, because it showed that all driving 
maneuvers could be broken down into a very limited set of simple elemental ma- 
neuvers: steady-state cruising in lane, lane changing, joining a platoon (initially, and 
misleadingly, called “merging”), and splitting from a platoon. To simplify
the protocol verification, some constraints were applied, specifically
requiring that a platoon could only engage in one maneuver at a time and defining the
first vehicle in the platoon (the platoon leader) as the master agent for the platoon as 
a whole. With these constraints, the full set of protocols, incorporating about 500,000 
reachable states and ten million transitions, was verified for completeness and cor- 
rectness using the COSPAN software verification tool. These protocols became the 
basis for much subsequent research on both coordination and link layer issues at 
PATH. The coordination layer was defined as a discrete controller supervising the 
continuous regulation-layer controllers, together comprising a hybrid system. 
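
The two simplifying constraints described above (one maneuver at a time, with the platoon leader as master agent) can be illustrated with a very small state machine. The following Python sketch is only an illustration under stated assumptions: the class and method names are hypothetical, and it is not the COSPAN specification of [8], which involved 13 interacting state machines.

```python
from enum import Enum, auto


class Maneuver(Enum):
    CRUISE = auto()        # steady-state cruising in lane (idle state)
    LANE_CHANGE = auto()
    JOIN = auto()          # joining a platoon
    SPLIT = auto()         # splitting from a platoon


class PlatoonLeader:
    """Hypothetical master agent that serializes maneuvers for its platoon."""

    def __init__(self):
        self.state = Maneuver.CRUISE

    def request(self, maneuver):
        """Grant a maneuver only if the platoon is currently cruising."""
        if maneuver is Maneuver.CRUISE:
            raise ValueError("cruising is the idle state, not a request")
        if self.state is not Maneuver.CRUISE:
            return False   # busy: a platoon engages in one maneuver at a time
        self.state = maneuver
        return True

    def complete(self):
        """Called when the active maneuver has finished; return to cruising."""
        self.state = Maneuver.CRUISE


leader = PlatoonLeader()
granted_first = leader.request(Maneuver.JOIN)
granted_second = leader.request(Maneuver.SPLIT)   # refused while JOIN is active
leader.complete()
granted_third = leader.request(Maneuver.SPLIT)
print(granted_first, granted_second, granted_third)
```

Serializing requests through a single agent is what keeps the reachable state space small enough to verify, at the cost of the efficiency restrictions discussed later in the chapter.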

The basic architectural framework of [5]-[7] has been extended to address ad- 
verse operating conditions and most faults that may occur within the automated high- 
way system by Lygeros et al. [9]. They identified the need to accommodate faults of 
six different levels of severity, and defined strategies to address those faults, which 
would be implemented using six new emergency maneuvers to augment the three 
basic maneuvers that were specified in [8]. The fault classes, in descending order of 
severity, [with strategies in brackets] are: 


• Vehicle stopped or must stop [gentle stop, crash stop, aided stop]
• Vehicle needs assistance to exit [take immediate exit, take immediate exit with escort]
• Vehicle needs no assistance to exit [take immediate exit/normal]
• Vehicle does not need to exit
• Infrastructure failure (not safety critical)
• Driver-vehicle interaction failure


The strategies for handling these faults were defined based on an intuitive under- 
standing of the operation of the automated highway system, trading off system safety 
and performance, and relying on use of communication for cooperation among ve- 
hicles. Significantly, this study did not consider the higher severity class of faults in 
which a vehicle executes a dangerous maneuver (sudden swerve, accelerate or decel- 
erate), perhaps as a consequence of errors in software design or implementation. 

Coordination among vehicles and between vehicles and the roadway infrastruc- 
ture requires wireless communications, so another important milestone was the defi- 
nition of the communication protocols needed to implement the maneuver protocols 
[10]. Even though Pravin Varaiya was not an author of [10], he was certainly ac- 
knowledged as an essential influence on this research. The communication protocols 
were designed to maintain safety while minimizing the impacts of faults on high- 
way operations, but also recognizing that it is not possible to completely preclude 
injurious crashes when multiple simultaneous faults occur. Reference [10] included
verification to show that the protocols were logically correct and would achieve their 
desired objectives. 

Considerable research has been done on the design and evaluation of the coordi- 
nated vehicle maneuvers, within the general protocol framework defined by Pravin 
Varaiya in [8]. Pravin Varaiya led the initial research on designing the entry and exit 
maneuvers, in concert with the design of entry and exit ramp geometries [11]. This 
research showed some of the limitations to the use of a “transition lane” between 
automated and conventional lanes, pointing toward the need for dedicated entrance 
and exit ramps for the AHS lanes. 

The behavior of the lead vehicle of a platoon needs to be defined at the coordina- 
tion layer as well as at the regulation layer, since the platoon leader is responsible for 
the maneuver coordination of the platoon as a whole. Roberto Horowitz and his re- 
search team described an initial concept for the platoon leader control in [12], based 
on definition of an allowable range of speed differences between consecutive pla- 
toons (called the “safety region”) in order to avoid high-speed crashes in the event 
of a hard-braking failure. This study included simulations of two-platoon join and 
split maneuvers under a variety of conditions, showing how the maneuvers need to 
take more time in order to reduce crash threats. As an initial study, this work did 
not address details of vehicle dynamics or sensor data imperfections. Subsequent re- 
search by the Horowitz team at PATH [13] led to more detailed definition of both the 
coordination and regulation layer controllers to avoid any crashes at impact speeds 
above a specified threshold and to preclude any crashes from propagating from one 
platoon to the next. This led to the definition of safe inter-platoon spacings and lim- 
itations on the maneuvering of the platoon leader, based on consideration of both 
safety and system performance. Controllers were defined for platoon joining, based 
on minimizing the time needed for the maneuver, subject to constraints on acceleration,
jerk and maximum permissible crash impact speed. Simulations of controller
performance showed the join maneuver times to range from 14.7 to 19.4 seconds
when joining from initial gaps of 30 m to 60 m, respectively, with a brake response
delay time of 20 ms.

The coordination layer is where the safety and capacity considerations of vehicle 
automation intersect, so it is important to develop a strong understanding of the trade- 
offs between those vital performance measures. The definitive study of these issues 
until now was conducted by PATH researchers recruited by Pravin Varaiya, under 
the auspices of the National Automated Highway Systems Consortium [14]. This 
paper shows a systematic comparison between conventional manual driving, with 
both alert and typical drivers, and several levels of automated driving (autonomous 
individual vehicles, two intermediate levels of cooperation, and closely-coupled pla- 
toons). These comparisons were based on a “worst case” hazard of a forward ve- 
hicle suddenly braking with maximum effort, and then considered the probability 
and severity of subsequent collisions. The faster responses of the automated systems 
generally caused them to have much lower collision probabilities than the manual 
drivers, but with only somewhat milder severity for the cases in which collisions 
were unavoidable. 

The automated platoon systems were significantly different from the autonomous 
individual vehicles and more loosely coupled cooperative vehicles, because they are 
designed to make a different trade-off between crash frequency and severity. The 
very close intra-platoon separations make it impossible to avoid collisions under the 
severe braking scenario, but they ensure that these collisions occur at very low impact 
speeds, where injuries are extremely unlikely. Table 14.1 and Fig. 14.2 show the 
probability density functions of collisions in platoons with nominal vehicle spacings 
from 1 m to 10 m. The composite measure of collision severity is the expected value
of the square of the impact speed, which is proportional to the kinetic energy of the 
crash. The table shows that the shortest spacings have a higher probability of crash, 
but much lower severity of crash. 
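
The reason that short gaps produce low impact speeds can be seen in a simple kinematic sketch. The Python code below is an illustrative point-mass model only, not the probabilistic analysis of [14]: the initial speed, the two deceleration values, the response delay and the time step are all assumed values chosen for illustration, and the follower is assumed to brake less hard than the leader so that a collision is possible.

```python
def impact_speed(gap_m, v0=25.0, lead_decel=8.0, follow_decel=6.0,
                 delay_s=0.1, dt=0.001):
    """Relative speed (m/s) at first contact, or None if no collision occurs.

    All default parameter values are illustrative assumptions.
    """
    x_lead, v_lead = gap_m, v0      # lead vehicle starts gap_m ahead
    x_follow, v_follow = 0.0, v0    # follower starts at the origin
    t = 0.0
    while v_lead > 0.0 or v_follow > 0.0:
        # The lead vehicle brakes from t = 0; the follower only after its delay.
        v_lead = max(0.0, v_lead - lead_decel * dt)
        if t >= delay_s:
            v_follow = max(0.0, v_follow - follow_decel * dt)
        x_lead += v_lead * dt
        x_follow += v_follow * dt
        t += dt
        if x_follow >= x_lead:
            return v_follow - v_lead
    return None                     # both vehicles stopped without contact


for gap in (1.0, 5.0, 10.0, 60.0):
    dv = impact_speed(gap)
    severity = None if dv is None else dv ** 2   # squared impact speed, m^2/s^2
    print(gap, dv, severity)
```

Under these assumptions the impact speed is about 2 m/s at a 1 m gap and about 6 m/s at a 10 m gap, while at a 60 m gap the follower stops without contact. The probabilities in the table arise from distributions of braking capability, which this deterministic sketch does not model.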

The overall trade-off between highway capacity and safety required an extremely 
complicated analysis in [14], which is summarized in Fig. 14.3. In this figure, the
crash severity is plotted vertically, with the “best” (least severe) at the top. In this 
case, the mean-square crash speed is plotted as an expected value to incorporate both 
the probability and severity of the crash. At the higher values of highway capacity 
(basically, all values above the current highway capacity of about 2000 vehicles per 
lane per hour), the platoon cases all show lower crash severities than the individual 
automated vehicles. This is because in order for the individual vehicles to reach those 
higher capacity levels they would have had to operate at separations short enough to 
produce high-speed impacts when crashes could not be avoided. The more frequent 
crashes of the platooned vehicles were much less severe than the crashes involving 
the other kinds of automated vehicles, which operated at larger nominal separations. 
This analysis, despite its complexity and sophistication, was still only able to address 
the first crash within the platoon, and has not yet been extended to address subsequent 
crashes that could occur between other pairs of vehicles within the platoon. 

Intra-Platoon    Total Probability    Expected Collision Severity Surrogate
Spacing (m)      of Collision         (ΔV collision)² (m²/s²)
 1               0.73                  2.94
 2               0.62                  5.13
 3               0.58                  7.38
 4               0.54                  9.87
 5               0.51                 12.6
 6               0.48                 15.6
 7               0.45                 18.9
 8               0.42                 22.4
 9               0.39                 26.2
10               0.36                 30.2

Table 14.1. Probability and severity of collisions.

Fig. 14.2. Probability density function of collision severity in sudden hard-braking scenario,
as a function of intra-platoon vehicle separation (from Reference [15], also published in
Reference [14]).

Fig. 14.3. Trade-off between highway capacity and safety for several automated highway
operating scenarios (from Reference [15], also published in Reference [14]).


14.4 AHS Link Layer Research 


Link layer operations will become important as soon as the first section of automated
highway lane(s) needs to be designed. The most basic level of link layer
analysis addresses the capacity increases that could be attained from automation of
vehicle operations. Lane capacity analyses date back to the 1960s and 1970s for
application to automated guideway transit (AGT) systems. However, much more sophisticated
and higher-fidelity studies were performed by PATH researchers for the
National Automated Highway Systems Consortium in the mid-1990s, based on consideration
of the braking capabilities of production vehicles (automobiles, buses and
trucks). Detailed results on the “pipeline” capacity of a simple automated lane, with- 
out introduction of merging conflicts, were reported in [15], covering a wide range 
of assumptions and operating conditions. These showed the effects of different lev- 
els of cooperation among the automated vehicles, with the close-coupled platoons 
providing the highest capacity by a large margin. Introducing even relatively small 
percentages of buses and trucks to the flow of passenger cars can significantly reduce 
the achievable capacity because of the poorer performance of these heavier vehicles 
and their consequent needs for larger safety spacings. 

Fig. 14.4. Ideal (“pipeline”) automated highway capacity as a function of platoon length and
cruising speed (from Reference [15], also published in Reference [16]).

Fig. 14.4 shows a representative example of the results from [15],[16], indicating
how the lane capacity depends on speed and the number of vehicles (passenger
cars, in this case) in the automated platoon. Comparable results were developed as
a function of the size of the intra-platoon gap, as well as other relevant parameters
such as the percentages of buses and trucks in the traffic stream, and alternative concepts
that would deny access to the automated lanes by vehicles that could not meet
certain minimum emergency braking requirements. Nevertheless, this study showed
the potential to reach ideal lane capacities in the range of 6000 to 7000 passenger
cars per hour before consideration of merge conflicts. More realistic assessments of
AHS lane capacity need to include consideration of conflicts with traffic trying to
enter at on-ramps, which can further limit capacity, and of the allocation of traffic 
to lanes in multi-lane AHS applications, particularly in response to incidents that 
may block one or more lanes. Pravin Varaiya led the first comprehensive treatment 
of these issues, using a mixture of microscopic and macroscopic simulations [17], 
which showed successful strategies for assigning target speeds by lane and for re- 
covering from incidents. Almost all subsequent link-layer studies can trace some of 
their thinking back to this original work. 
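
The dependence of ideal “pipeline” capacity on speed, platoon size and gaps can be sketched with a simple geometric calculation: each platoon, together with the gap to the next platoon, occupies a block of roadway, and capacity is the number of vehicles carried past a point per hour. The Python sketch below is a simplified illustration and not the model used in [15],[16]; the 5 m vehicle length is an assumed value, and the 2 m intra-platoon and 60 m inter-platoon gaps are taken as fixed nominal values, whereas in practice safe gaps depend on speed and braking capability.

```python
def pipeline_capacity(speed_mps, platoon_size, vehicle_len_m=5.0,
                      intra_gap_m=2.0, inter_gap_m=60.0):
    """Vehicles per lane per hour for identical, uniformly spaced platoons.

    Default lengths and gaps are illustrative assumptions.
    """
    # Roadway length occupied by one platoon plus the gap to the next platoon.
    block_m = (platoon_size * vehicle_len_m
               + (platoon_size - 1) * intra_gap_m
               + inter_gap_m)
    platoons_per_hour = 3600.0 * speed_mps / block_m
    return platoons_per_hour * platoon_size


for n in (1, 5, 10, 20):
    print(n, round(pipeline_capacity(25.0, n)))
```

With these assumed values, a platoon size of 1 (individual vehicles at the inter-platoon gap) gives roughly 1400 vehicles per hour at 25 m/s, and the result grows with platoon size because the large inter-platoon gap is shared among more vehicles.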

After merging, the other important link layer control function is the assignment
of vehicles to lanes in a multi-lane automated highway system. Pravin Varaiya led 
the development of the basic approach for allocating lane resources to vehicle ac- 
tivities by using the product of space and time needed for each activity [18]. This 
paper used a linear programming approach to maximize lane capacity, but without 
considering the effects on travel time. Its case studies addressed the capacity of a 
single-lane AHS and a merge junction, considering separate examples of manual 
driving, adaptive cruise control and automated vehicles in 15-car platoons. The re- 
sults were heavily dependent on the assumptions that were applied about lower-level 
controller performance and vehicle maneuvering protocols, which were represented 
in the space estimates for the activities (entry and exit each needing 65 m, platoon 
join and split maneuvers each needing 28 m, and steady cruising in platoons, ACC 
and manual driving respectively needing 10 m, 40 m and 50 m). Building from that 
starting point, a pair of papers by Hall and Li addressed the issue of multiple ve- 
hicle classes with different performance characteristics (cars and buses) sharing use 
of the same AHS facility [19],[20]. Starting from the fundamental safety consider- 
ation that vehicles of such different mass and performance should not be combined 
within the same close-formation platoon, they also considered the possibility of fur- 
ther segregating platoons by destination, but found that that could produce a signif- 
icant capacity loss [19]. They evaluated a large set of merge entry rules in Monte 
Carlo simulation, with separate entry queues for the cars and buses, and estimated 
the queue lengths and times (and queuing ramp lengths needed) for each [20]. 
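
The idea of budgeting lane resources as space multiplied by time can be illustrated with the space estimates quoted above. The Python sketch below is a rough reading of that idea and is not the linear program of [18]: the cruising speed, trip length, lane length and maneuver duration are assumed values, and the way entry and exit are added to cruising is a simplification made for illustration. With a single resource constraint, the optimization reduces to a ratio.

```python
# Space needed per activity in metres, as quoted in the text for [18].
SPACE_M = {"entry": 65.0, "exit": 65.0, "join": 28.0, "split": 28.0,
           "platoon": 10.0, "acc": 40.0, "manual": 50.0}


def cruise_only_flow(mode, speed_mps):
    """Vehicles per hour if every vehicle only cruises in the given mode."""
    return 3600.0 * speed_mps / SPACE_M[mode]


def space_time_per_trip(mode, trip_len_m, speed_mps, maneuver_time_s):
    """Space-time (m*s) used by one trip: cruising plus one entry and one exit."""
    cruise = SPACE_M[mode] * trip_len_m / speed_mps
    maneuvers = (SPACE_M["entry"] + SPACE_M["exit"]) * maneuver_time_s
    return cruise + maneuvers


def trips_per_hour(mode, lane_len_m, trip_len_m, speed_mps, maneuver_time_s):
    """Upper bound on trips served by one lane of the given length in one hour."""
    budget = lane_len_m * 3600.0   # available space-time in m*s
    return budget / space_time_per_trip(mode, trip_len_m, speed_mps,
                                        maneuver_time_s)


for mode in ("manual", "acc", "platoon"):
    print(mode, cruise_only_flow(mode, 25.0))
```

At an assumed 25 m/s, the cruise-only figures are 1800, 2250 and 9000 vehicles per hour for manual driving, ACC and platoons respectively; charging each trip for entry and exit space lowers the platoon figure, which is why the results depend so heavily on the assumed maneuvering protocols.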

Building on the prior Varaiya work, Horowitz’ research team addressed lane al- 
location by seeking to control the density and velocity profiles in each lane of a 
three-lane AHS [21], using information local to the link communicated among all 
the local vehicles and the roadway infrastructure. 

When an automated highway facility is physically segregated from the normal 
highway system, it is necessary to provide special access provisions for emergency 
vehicles to handle crashes, medical emergencies and law enforcement problems. This 
can be challenging in a physically constrained location when traffic is running close 
to capacity. Toy et al. [22] defined three types of maneuvers that could be used to
help emergency vehicles gain priority access to a problem location, on a highway 
with at least two lanes but no shoulder, by getting the other vehicles out of the way, 
and used a flow simulation to show their effectiveness. 

An overall link-layer traffic flow control strategy was defined by Alvarez et al.
[23] as the culmination of the link-layer design work initiated by Pravin Varaiya. This 
strategy was shown via simulation to be able to stabilize vehicle density and flow rate 
around the desired profiles by using speed and lane changes as the control signals. 
This in effect showed that a viable link layer could be designed, complementing 
other analogous demonstrations of viability of the lower layers in the hierarchical 
AHS architecture. 


14.5 Modeling and Simulation Tools for AHS Design and Evaluation


The design and evaluation of the AHS coordination and link layer concepts and 
designs could not be conducted entirely in the abstract or using existing analytical 
tools. From the earliest stages of the PATH Program, we knew that we were going 
to need new modeling and simulation tools in order to represent AHS systems with 
sufficient fidelity and efficiency to support multiple iterations of system design. The 
existing tools were not well suited to support this work for a variety of reasons: 


• The existing traffic models are founded on approximations of the behavior of
  drivers, whose car-following and lane changing behaviors are very different from
  what the AHS should be doing if it is going to improve highway efficiency, capacity
  and safety;
• The existing transportation network planning models are still based on higher-level
  abstractions of driver behavior and traffic flow, and cannot represent the
  significantly higher capacity that AHS can provide;
• The existing vehicle dynamics and control models were based on continuous
  dynamic models, but did not address discrete changes of operating mode effectively;
• There were no models available to represent the new phenomena associated with
  active coordination of vehicle maneuvers at the AHS coordination layer;
• The spatial scope and time scales of the phenomena that need to be represented
  cover an exceptionally wide range: from an individual vehicle to an entire
  metropolitan region, and from the millisecond time scale inside the inner control
  loops onboard a vehicle to a full 24-hour operating day for the transportation
  network.


Pravin Varaiya recognized AHS as a useful application case study for his re- 
search on hybrid systems, because of a variety of complexities inherent in the AHS 
application: 


• vehicles exhibiting continuous, nonlinear dynamic behavior;
• discrete transitions associated with vehicle maneuvers;
• coordination of maneuvers of neighboring vehicles, representing individual agents
  interacting in complex ways;
• vehicles frequently entering and exiting any defined zone of interest;
• safety-critical consequences of failures necessitating very high confidence in system
  design;
• realistic system implementations involving large numbers of vehicles and coupled
  interactions at several levels requiring a computationally efficient simulation
  in order to be represented effectively.
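
The first two items in this list, continuous dynamics combined with discrete transitions, are what make the system hybrid. A minimal Python sketch of that combination follows. It is not SHIFT or SmartAHS code, and every name and parameter value in it is an assumption made for illustration: a follower closes the gap to a leader under a saturated proportional law (the continuous part) and switches from a JOIN mode to a FOLLOW mode when a guard on the gap is satisfied (the discrete part).

```python
def simulate_join(initial_gap_m=60.0, target_gap_m=2.0,
                  max_closing_speed=3.0, gain=0.5, dt=0.01, tol_m=0.05):
    """Return (time_s, final_gap_m, mode_history) for one join maneuver.

    All parameter values are illustrative assumptions.
    """
    mode = "JOIN"            # discrete state
    gap = initial_gap_m      # continuous state
    t = 0.0
    history = [mode]
    while mode != "FOLLOW" and t < 300.0:
        # Continuous dynamics: closing speed proportional to the gap error,
        # saturated at a maximum relative speed.
        closing = min(max_closing_speed, gain * (gap - target_gap_m))
        gap -= closing * dt
        t += dt
        # Discrete transition: the guard is a condition on the continuous state.
        if gap - target_gap_m <= tol_m:
            mode = "FOLLOW"
            history.append(mode)
    return t, gap, history


elapsed, final_gap, modes = simulate_join()
print(round(elapsed, 1), round(final_gap, 2), modes)
```

A realistic simulator must handle many such agents at once, with vehicles entering and leaving the zone of interest, which is the dynamic reconfiguration that motivated the tools described below.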


278 S.E. Shladover

Varaiya’s interests and capabilities in hybrid systems matched well with the needs
of the National Automated Highway Systems Consortium for a suite of modeling
and simulation tools to support its work. The substantial funding available for development
of these tools supported an active cluster of hybrid system research led
by Pravin Varaiya, involving a dynamic group of researchers including Akash Deshpande,
Aleks Göllü, Farokh Eskafi, Michael Kourjanski, Marco Antoniotti, Luigi
Semenzato and Mireille Broucke.

The original framework for object-oriented simulation of hybrid systems, based 
on distributed control agents working within the previously-defined layered architec- 
ture, was called SmartDb, indicating the importance of its underlying database [24]. 
The first application of this to represent the specific set of AHS maneuvers that were 
defined in [8] was called SmartPath [25]. SmartPath featured an attractive graphi- 
cal animation of vehicle maneuvers, but it did not have the flexibility to enable it 
to represent a broader range of AHS alternatives than those featured in [8]. Never- 
theless, the lessons learned in its development were of great value in facilitating the 
development of the more general hybrid system simulator to follow. 

The SmartAHS simulator was developed by the Varaiya-led research team, based 
on the Object Management Systems (OMS) object-oriented approach [26]. The dy- 
namically reconfigurable hybrid system aspects of the AHS phenomenology moti- 
vated the development of a new language for representing more general hybrid sys- 
tems, initially named the “Hybrid Systems Tool Interface Format” (HSTIF). Since 
this was unpronounceable, the team decided to scramble the sequence of letters in 
the name to produce the much easier-to-say name of SHIFT [27]-[28]. Even though 
it was developed for AHS applications, it has subsequently been used by a variety 
of researchers to represent a much wider range of dynamic systems, including auto- 
mated submarines, air traffic control systems and material handling systems. 


14.6 Continuing AHS Research Needs 


Although great progress has been made in the research directed toward highway 
automation, there are still some significant research challenges remaining. 


14.6.1 Network layer 


The network layer of an automated highway system is not dramatically different 
from a conventional transportation network, except that the availability of automatic 
control of vehicle motions makes it possible to deterministically assign vehicles to 
system-optimal routes, rather than depending on the more uncertain choices that are 
made by human drivers. This could actually make the network-layer control simpler 
than it is for conventional roadway networks. 

The larger challenge at the network layer is in predicting the transportation sys- 
tem impacts of the implementation of a high-capacity automated highway. The un- 
certainties here are associated with individuals’ choices about whether to purchase 
suitably equipped vehicles and when they choose to use those vehicles on the auto- 
mated lanes for specific trips. Those decisions in turn affect the long-term patterns 
of land use and the locational choices that families and businesses make based on 
their perceptions of the relative accessibility of alternative locations. The extent to 
which the automated highway ultimately reduces traffic congestion will depend on 
these (inevitably subjective) decisions about lifestyles and travel behavior. 


14.6.2 Link and coordination layers


The research issues at the link and coordination layers are closely related to each 
other because of the importance of the vehicle maneuvering and communication pro- 
tocols to both. The first-generation protocols that were described by the Varaiya team 
in [8] have served as an important foundation for a full generation of research on 
the operation of these two layers. However, those protocols were built on some ex- 
tremely conservative assumptions that need to be re-examined in the interest of im- 
proving system efficiency. In particular, the following restrictions tend to impede the 
efficient entry and exit of vehicles to and from an automated lane: 


• A platoon may be engaged in only one maneuver at a time.
• A vehicle must complete splits from the other members of a platoon before
  changing lanes.
• A vehicle must complete its lane change before beginning to join a platoon.
• Split and join maneuvers must transition the full range from the nominal intra-platoon
  vehicle spacing (in the range of 2 to 4 m) to the nominal inter-platoon
  spacing (in the range of 60 m), thereby consuming considerable space and time.


These restrictions simplified the verification of the original protocols and helped 
to ensure their safety, but they also impede the use of promising strategies for cooperative
merging and exiting maneuvers (for example, a merging vehicle tagging onto the end of a
passing platoon, or an exiting vehicle simply changing lanes out of its platoon when a parallel
off-ramp becomes available). A second generation of protocols that improves
efficiency while maintaining safety needs to be developed, evaluated and verified. 

Once the second-generation protocols are defined, many of the link and coordi- 
nation layer studies will need to be revisited and updated to show how much system 
efficiency can be improved. 

The communication systems to implement coordination among vehicles and be- 
tween vehicles and the roadway infrastructure also need further development. This 
is progressing rapidly with the current high interest in “dedicated short-range com- 
munications” for safety-critical transportation applications, and is likely to lead to 
commercially available products within a few years. 


14.7 Concluding Remarks 


Fifteen years ago, when the PATH research program was starting to ramp up 
its activities, the state of knowledge on the “system level” issues associated with 
highway automation was very primitive. Research had already been done on the 
regulation-layer control of the motions of individual vehicles, including some experi- 
ments on test tracks. However, very little thought had been devoted to the higher-level 
issues associated with how to coordinate the maneuvers of neighboring vehicles to 
improve their safety and efficiency and how to manage flows of automated vehicles 
through a roadway network. 

The research that Pravin Varaiya led at PATH during the 1990s made major contributions
toward showing the technical feasibility of automated highway systems
and bringing many other researchers to focus their attention on the related research 
challenges. The distributed, hierarchical architecture that he defined for AHS (in- 
deed, for all of Intelligent Transportation Systems) remains the foundation for most 
current thinking on the subject, and has served many researchers well by helping to 
decompose a large and complicated system of systems into manageable-size pieces. 
He and his research colleagues have also filled in many of the important gaps in the 
middle layers of this architecture, showing that: 


• all automated vehicle maneuvers can be built up from a common set of simple
  building blocks;
• these maneuvers can be verified systematically to prove their correctness and
  safety;
• coordination protocols can be defined to enable vehicles to cooperate with each
  other efficiently and safely;
• the automated vehicle operations can improve highway lane capacity significantly
  compared to conventional traffic, without compromising safety;
• link-layer protocols can help automated systems maximize their efficiency even
  when performance is degraded by vehicle failures or other incidents;
• entry to and exit from high-capacity automated highway lanes should be effected
  by means of dedicated ramps rather than by merging across adjacent lanes.


Furthermore, his research team has provided the modeling and simulation tools 
to enable other researchers to explore their own alternative ideas about how to de- 
sign AHS operations, as well as other complex dynamically-reconfiguring hybrid 
systems. 

Roberto Horowitz and Pravin Varaiya gave their overall perspective on the AHS 
control system design work that the PATH team did in a paper published in 2000 
[29], but even in that overview they introduced some new results on the safety of 
platoon join maneuvers and link layer control of traffic flow density. Now we have a 
few more years to look back on those accomplishments and identify what still needs 
to be done to make progress toward deployment of viable automated transportation 
systems. 


Summary. Recurrent and non-recurrent congestions on freeways may be substantially reduced
if today’s “spontaneous” infrastructure utilisation is replaced by an orderly, controllable
operation via comprehensive application of ramp metering and freeway-to-freeway control, 
combined with powerful optimal control techniques. This chapter first explains why ramp me- 
tering can lead to a dramatic amelioration of traffic conditions on freeways. Subsequently, a 
large-scale example demonstrates the high potential of advanced ramp metering approaches. 
It is demonstrated that the proposed control scheme is efficient, fair and real-time feasible. 


15.1 Introduction 


Urban and interurban freeways were originally conceived so as to provide
virtually unlimited mobility to road users. The on-going dramatic expansion of car- 
ownership, however, has led to the daily appearance of recurrent and nonrecurrent 
freeway congestions of thousands of kilometres in length around the world. Iron- 
ically, daily recurrent congestions reduce substantially the available infrastructure 
capacity at the rush hours, i.e. at the time this capacity is most urgently needed, 
causing delays, increased environmental pollution, and reduced traffic safety. Simi- 
lar effects are observed in the frequent case of nonrecurrent congestions caused by 
incidents, road works, etc. It has been recently realized that the mere infrastructure 
expansion cannot provide a complete solution to these problems due to economic 
and environmental reasons or, in metropolitan areas, simply due to lack of space. 

The traffic situation on today’s freeways very much resembles the one in 
urban road networks prior to the introduction of traffic lights: blocked links, 
chaotic intersections, reduced safety. It seems like road authorities and road users 
are still chasing the phantom of unlimited mobility that freeways were originally 
supposed to provide. What is urgently needed, however, is to restore and maintain 
the full utilisation of the freeways’ capacity along with an orderly and balanced sat- 
isfaction of the occurring demand both during rush hours and in case of incidents. 
Clearly, the passage from chaotic to optimal traffic conditions is only possible
if today’s “spontaneous” use of the freeway infrastructure is replaced by
suitable control actions aiming at the benefit of all users. Ramp metering is
the most efficient means to this end, whereby short delays at on-ramps and
freeway-to-freeway intersections are the (relatively low) price to pay for
capacity flow on the freeway itself, leading to substantial savings for each
individual road user.

Fig. 15.1. A general traffic network.
[Diagram: demands $d_i(k)$ enter the traffic network; exit flows $s_i(k)$ leave it.]

This chapter first explains, based on simple arguments, why ramp metering may
lead to a substantial amelioration of traffic conditions on freeways
(section 15.2). Then a hierarchical ramp metering control strategy, based on a
nonlinear optimal control problem formulation, is presented in section 15.3. A
simulation example demonstrates the high amelioration potential of advanced ramp
metering algorithms in section 15.4. Finally, section 15.5 summarizes the main
conclusions.


15.2 Why Ramp Metering? 


15.2.1 A basic property 


To be able to answer this question, we will first recall a simple fact. Consider any 
traffic network (Fig. 15.1) with demand appearing at several locations (e.g. at the on- 
ramps, in case of a freeway network) and exit flows forming at several destinations 
(e.g. at the freeway off-ramps). Clearly, the accumulated demand over, say, a day will 
be equal to the accumulated exit flows, because no vehicle disappears or is generated 
in the network. Let us assume that the demand level and its spatial and temporal 
distribution are independent of any control measures taken in the network. Then, we 
are interested to know how much accumulated time will be needed by all drivers 
to reach their respective destinations at the network exits (network efficiency!). It is 
quite evident that this total time spent by all drivers in the traffic network will be 
longer if, for any reason (e.g. due to lack of suitable control measures), the exit flows 
are temporarily lower, i.e. if vehicles are delayed within the network on their way to 
their destinations. As a consequence, any control measure or control strategy that can 
manage to increase the early exit flows of the network, will lead to a corresponding 
decrease of the total time spent. 

The statements above may be formalized by use of simple mathematics [9], [11].
For the needs of this chapter we will use a discrete-time representation of
traffic variables with discrete time index k = 0, 1, 2, ... and time interval T.
A traffic volume or flow q(k) (in veh/h) is defined as the number of vehicles
crossing a corresponding location during the time period [kT, (k + 1)T],
divided by T.

We consider a traffic network (Fig. 15.1) that receives demands $d_i(k)$ (in
veh/h) at its origins $i = 1, 2, \ldots$ and we define the total demand
$d(k) = d_1(k) + d_2(k) + \cdots$. We assume that $d(k)$, $k = 0, \ldots, K-1$,
is independent of any control measures taken in the network. We define exit
flows $s_i(k)$ at the network destinations $i = 1, 2, \ldots$, and the total
exit flow $s(k) = s_1(k) + s_2(k) + \cdots$. We wish to apply control measures
so as to minimize the total time spent $T_s$ in the network over a time horizon
$K$, i.e.


$$T_s = T \sum_{k=1}^{K} N(k) \tag{15.1}$$


where $N(k)$ is the total number of vehicles in the network at time $k$. Due to
conservation of vehicles


$$N(k) = N(k-1) + T\,[d(k-1) - s(k-1)]. \tag{15.2}$$


Substituting (15.2) in (15.1) we obtain 


$$T_s = T \sum_{k=1}^{K} \Big[ N(0) + T \sum_{\kappa=0}^{k-1} d(\kappa) - T \sum_{\kappa=0}^{k-1} s(\kappa) \Big]. \tag{15.3}$$


The first two terms in the outer sum of (15.3) are independent of the control
measures taken in the network, hence minimization of $T_s$ is equivalent to
maximization of the quantity


$$S = T^2 \sum_{k=1}^{K} \sum_{\kappa=0}^{k-1} s(\kappa) = T^2 \sum_{k=0}^{K-1} (K-k)\, s(k). \tag{15.4}$$


Thus, minimization of the total time spent in a traffic network is equivalent to 
maximization of the time-weighted exit flows. In other words, the earlier the vehicles 
are able to exit the network (by appropriate use of the available control measures) the 
less time they will have spent in the network. 
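
The equivalence between (15.1)-(15.3) and the weighted exit flow (15.4) can be
checked numerically. The following Python sketch is illustrative only: the
function names, the initial vehicle count and the demand and exit-flow series
are made-up assumptions, not data from the chapter. It computes the total time
spent once by accumulating the conservation equation and once as the
control-independent part minus S.

```python
def total_time_spent(n0, demand, exit_flow, T):
    """T_s = T * sum_{k=1..K} N(k), with N(k) updated by conservation (15.2)."""
    n, tts = n0, 0.0
    for d, s in zip(demand, exit_flow):
        n += T * (d - s)      # N(k) = N(k-1) + T [d(k-1) - s(k-1)]
        tts += T * n          # accumulate T * N(k) for k = 1..K
    return tts


def weighted_exit_flow(exit_flow, T):
    """S = T^2 * sum_{k=0..K-1} (K - k) s(k), eq. (15.4)."""
    K = len(exit_flow)
    return T**2 * sum((K - k) * s for k, s in enumerate(exit_flow))


def control_independent_part(n0, demand, T):
    """First two terms of (15.3), which do not depend on the control."""
    K = len(demand)
    return T * K * n0 + T**2 * sum((K - k) * d for k, d in enumerate(demand))


# Hypothetical data: 10 s time step (in hours), flows in veh/h.
T = 10.0 / 3600.0
N0 = 100.0
demand = [3600.0, 3600.0, 3600.0, 3600.0]
early_exit = [4000.0, 3800.0, 3400.0, 3200.0]   # same total, served earlier
late_exit = [3200.0, 3400.0, 3800.0, 4000.0]    # same total, served later

C = control_independent_part(N0, demand, T)
tts_early = total_time_spent(N0, demand, early_exit, T)
tts_late = total_time_spent(N0, demand, late_exit, T)
```

Both exit-flow series serve the same number of vehicles, yet the one that
releases vehicles earlier gives the smaller total time spent, in line with the
statement above.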


15.2.2 First answer 


We consider (Fig. 15.2) two cases for a freeway on-ramp: (a) without and (b)
with metering control. Let $q_{in}$ be the upstream freeway flow, $d$ be the
ramp demand, $q_{con}$ be the mainstream outflow in presence of congestion, and
$q_{cap}$ be the freeway capacity. It is well-known that the outflow $q_{con}$
in case of congestion is lower by some 5-10% than the freeway capacity
$q_{cap}$. In Fig. 15.2(b), we assume that ramp metering may be used to maintain
capacity flow on the mainstream, e.g. by using the control strategy ALINEA [13].
Of course, the application of ramp metering creates a queue at the on-ramp but,
because $q_{cap}$ is greater than $q_{con}$ (increased outflow!), ramp metering
leads to a reduction of the total time spent (including the ramp waiting time).
It is easy to show [11] that the amelioration $\Delta T_s$ (in %) of the total
time spent is given by

$$\Delta T_s = \frac{q_{cap} - q_{con}}{q_{in} + d - q_{con}} \cdot 100. \tag{15.5}$$

As an example, if $q_{in} + d = 1.2\, q_{cap}$ (i.e. the total demand exceeds
the freeway capacity by 20%) and $q_{con} = 0.95\, q_{cap}$ (i.e. the capacity
drop due to the congestion is 5%) then $\Delta T_s = 20\%$ results from (15.5),
which demonstrates the importance of ramp metering.

Fig. 15.2. Two cases: (a) without and (b) with ramp metering; grey areas
indicate congestion zones.
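
The arithmetic of the example can be reproduced with a few lines of Python.
This is a minimal sketch: the function name and the capacity value of
2000 veh/h are illustrative assumptions, and only the ratios 1.2 and 0.95 come
from the text.

```python
def amelioration_percent(q_in, d, q_cap, q_con):
    """Relative reduction of total time spent in percent, eq. (15.5)."""
    excess = q_in + d - q_con
    if excess <= 0:
        raise ValueError("no congestion: q_in + d must exceed q_con")
    return (q_cap - q_con) / excess * 100.0


q_cap = 2000.0                      # hypothetical capacity in veh/h
delta_ts = amelioration_percent(
    q_in=0.9 * q_cap,               # upstream flow (assumed split of the 1.2 q_cap)
    d=0.3 * q_cap,                  # ramp demand, so that q_in + d = 1.2 q_cap
    q_cap=q_cap,
    q_con=0.95 * q_cap,             # 5% capacity drop
)
print(f"amelioration: {delta_ts:.1f}%")
```

Only the sum of upstream flow and ramp demand enters (15.5), so any other split
of the 1.2 capacity total gives the same value.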


15.2.3 Second answer 


We consider (Fig. 15.3) two cases of a freeway stretch that includes an on-ramp
and an off-ramp, namely (a) without and (b) with metering control. In order to
clearly separate the different effects of ramp metering, we will assume here
that $q_{con} = q_{cap}$, i.e. no capacity drop due to congestion. Defining the
exit rate $\gamma$ ($0 < \gamma < 1$) as the portion of the upstream flow that
exits at the off-ramp, it is easy to show [11] that the exit flow without
control is given by


$$s^{nc} = \frac{\gamma}{1-\gamma}\,(q_{cap} - d) \tag{15.6}$$


while with metering control we have

$$s^{rm} = \gamma \cdot q_{in}. \tag{15.7}$$


Because $(1-\gamma)\, q_{in} + d > q_{cap}$ holds (else the congestion would not
have been created), it follows that $s^{nc}$ is less than $s^{rm}$, hence ramp
metering increases the outflow, thus decreasing the total time spent in the
system. It is easy to show [11] that the amelioration of the total time spent in
this case amounts to


$$\Delta T_s = \gamma \cdot 100. \tag{15.8}$$


As an example, if the exit rate is $\gamma = 0.05$ then the amelioration is
$\Delta T_s = 5\%$. If several upstream off-ramps are blocked by the congestion
in absence of ramp metering (which is typically the case in many freeways during
rush hours) then the amelioration achievable via introduction of ramp metering
is accordingly higher.

Fig. 15.3. Two cases: (a) without and (b) with ramp metering.

Summing up the effects of sections 15.2.2 and 15.2.3 in a freeway network, an
overall amelioration of the total time spent by as much as 50% (i.e. halving of
the average journey time) may readily result (see section 15.4). This may also
be demonstrated via suitable treatment of real (congested) freeway traffic data
as suggested by Varaiya and coworkers in [1], where it is estimated that the
annual congestion delay of 70 million veh-h on Los Angeles freeways could be
reduced by 50 million veh-h if the highways were to be operated at 100%
efficiency, e.g. via efficient ramp metering.
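
The off-ramp argument of section 15.2.3 can also be illustrated numerically.
The sketch below compares the off-ramp exit flow without control and with
metering for one set of made-up flows; the function name and all numbers are
assumptions chosen only so that the congestion condition holds, and the exit
rate is called `gamma` in the code.

```python
def exit_flows(q_in, d, q_cap, gamma):
    """Return (s_nc, s_rm): off-ramp exit flow without and with ramp metering.

    Assumes no capacity drop (q_con = q_cap), as in section 15.2.3.
    """
    if not 0.0 < gamma < 1.0:
        raise ValueError("exit rate must lie strictly between 0 and 1")
    s_nc = gamma / (1.0 - gamma) * (q_cap - d)   # congestion blocks the off-ramp
    s_rm = gamma * q_in                          # free flow past the off-ramp
    return s_nc, s_rm


# Hypothetical flows in veh/h; (1 - gamma) * q_in + d = 4400 > q_cap = 4000.
s_nc, s_rm = exit_flows(q_in=4000.0, d=600.0, q_cap=4000.0, gamma=0.05)
print(f"exit flow without control: {s_nc:.1f} veh/h, with metering: {s_rm:.1f} veh/h")
```

Whenever the demand is high enough to create congestion, the uncontrolled exit
flow is the smaller of the two, which is the mechanism behind the second
answer.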


15.2.4 Further impacts 


The road users choose their respective routes towards their destinations so as to 
minimize their individual travel times. When a control measure (e.g. ramp metering) 
is introduced that may change the delay experienced in particular network links (e.g. 
on-ramps), a portion of the drivers will accordingly change their usual route in order 
to benefit from, or avoid disbenefits due to, the new network conditions. For
example, in the case of Fig. 15.2(b), the upstream flow $q_{in}$ will probably
increase while the ramp demand $d$ will decrease as compared to Fig. 15.2(a).
Because the route choice
behaviour of drivers is predictable to a large extent (Traffic Assignment problem!), 
ramp metering may also be used so as to impose an operationally desired traffic flow 
distribution in the overall network, e.g. avoidance of the rat-running phenomenon, 
increased or decreased utilisation of underutilised or overloaded, respectively, paral- 
lel arterials etc. Clearly, the modified routing behaviour of drivers should be taken 
into account in the design and evaluation phases of ramp metering control strategies. 

Several field evaluation results (see e.g. [7]) demonstrate that ramp metering
improves the merging behaviour of traffic flow at freeway intersections, which
may have a significant positive impact on traffic safety due to fewer lane
changes and reduced driver stress. Moreover, the increase of network efficiency
related to both answers above is expected to lead to correspondingly improved
network traffic safety and reduced pollutant emissions to the environment.


15.3 A Hierarchical Ramp Metering Strategy 


15.3.1 Traffic flow modeling 


A validated second-order traffic flow model is used for the description of traffic 
flow on freeway networks. The network is represented by a directed graph whereby 
the links of the graph represent freeway stretches. Each freeway stretch has uniform 
characteristics, i.e., no on-/off-ramps and no major changes in geometry. The nodes 
of the graph are placed at locations where a major change in road geometry occurs, 
as well as at junctions, on-ramps, and off-ramps. 

The time and space arguments are discretized. The discrete time step is denoted
by $T$ (typically $T = 5 \ldots 15$ s). A freeway link $m$ is divided into $N_m$
segments of equal length $L_m$ (typically $L_m \approx 500$ m), such that the
stability condition $L_m > T \cdot v_{f,m}$ holds, where $v_{f,m}$ is the
free-flow speed of link $m$. This condition ensures that no vehicle traveling
with free speed will pass a segment during one simulation time step. Each
segment $i$ of link $m$ at time $t = kT$, $k = 0, \ldots, K$, where $K$ is the
time horizon, is macroscopically characterized via the following variables: the
traffic density $\rho_{m,i}(k)$ (veh/lane-km) is the number of vehicles in
segment $i$ of link $m$ at time $t = kT$ divided by $L_m$ and by the number of
lanes $\lambda_m$; the mean speed $v_{m,i}(k)$ (km/h) is the mean speed of the
vehicles included in segment $i$ of link $m$ at time $kT$; and the traffic
volume or flow $q_{m,i}(k)$ (veh/h) is the number of vehicles leaving segment
$i$ of link $m$ during the time period $[kT, (k+1)T]$, divided by $T$. The
evolution of traffic state in each segment is described by use of two
interconnected state equations for the density and mean speed, respectively
[2], [3], [4].
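
The stability condition on segment length and time step is easy to check when
choosing a discretization. The following sketch is only an illustration: the
function name and the free-flow speeds are assumptions, while the 500 m segment
length and the 10 s and 15 s time steps fall within the typical ranges quoted
above.

```python
def satisfies_stability_condition(segment_length_m, time_step_s, free_speed_kmh):
    """True if L_m > T * v_f,m, i.e. a vehicle at free-flow speed cannot
    traverse a whole segment within one simulation time step."""
    distance_per_step_m = free_speed_kmh / 3.6 * time_step_s  # km/h -> m/s
    return segment_length_m > distance_per_step_m


# Hypothetical free-flow speeds; 110 km/h covers about 306 m in 10 s.
ok_case = satisfies_stability_condition(500.0, 10.0, 110.0)
# At 130 km/h a vehicle covers about 542 m in 15 s, longer than the segment.
bad_case = satisfies_stability_condition(500.0, 15.0, 130.0)
```

If the check fails, either the time step has to be shortened or the segments
have to be lengthened.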

For origin links, i.e., links that receive traffic demand $d_o$ and forward it
into the freeway network, a simple queue model is used (Fig. 15.4). The outflow
$q_o(k)$ of an origin link is limited by a number of upper bounds; more
specifically, $q_o(k)$ cannot be higher than:


(i) the total present ramp demand $d_o(k) + w_o(k)/T$, where $d_o(k)$ is the
arriving demand at period $k$ and $w_o(k)$ (veh) is the current ramp queue;

(ii) the merging flow capacity $Q_o(k)$, which depends on the current density
$\rho_{\mu,1}(k)$ of the merge segment; more specifically, $Q_o(k)$ is equal to
the constant ramp flow capacity $q_{o,\max}$ so long as $\rho_{\mu,1}(k)$ is
less than a critical density $\rho_{\mu,cr}$; if
$\rho_{\mu,1}(k) > \rho_{\mu,cr}$ then $Q_o(k)$ is linearly decreased with
increasing $\rho_{\mu,1}(k)$ and reaches zero when $\rho_{\mu,1}(k)$ attains a
maximum value $\rho_{\max}$.


If a ramp is not metered, the corresponding outflow $q_o(k)$ takes the lower of
the two upper bounds above. In case of ramp metering, the outflow $q_o(k)$ is
determined by the metering strategy but is eventually limited to the same bounds
if necessary. Note that due to bound (ii), a ramp queue may be created even
without ramp metering, e.g. if the ramp flow capacity is reduced due to
overcritical freeway density $\rho_{\mu,1}$, i.e. due to mainstream congestion.
The evolution of the origin queue $w_o$ is described by an additional state
equation (conservation of vehicles). Note that the freeway flow $q_{\mu,1}(k)$
in merge segments attains a maximum value $q_{\mu,1}^{cap}$ if the corresponding
density $\rho_{\mu,1}(k)$ takes values near a critical density $\rho_{\mu,cr}$.

Fig. 15.4. The origin-link queue model.
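
The origin-link queue model described above can be sketched in a few lines of
Python. This is an illustration under stated assumptions, not the METANET
implementation: the function names and parameter values are hypothetical, the
linear decrease of the merging capacity is assumed to start from the full ramp
capacity at the critical density, and the queue update is the plain
conservation of vehicles mentioned in the text.

```python
def merging_capacity(rho, q_max, rho_cr, rho_max):
    """Bound (ii): constant up to rho_cr, then linear decrease to zero at rho_max."""
    if rho <= rho_cr:
        return q_max
    if rho >= rho_max:
        return 0.0
    return q_max * (rho_max - rho) / (rho_max - rho_cr)


def origin_step(w, d, rho, T, q_max, rho_cr, rho_max, metered_flow=None):
    """Advance an origin link by one time step T (in hours).

    w: queue (veh), d: arriving demand (veh/h), rho: merge-segment density.
    Returns (outflow in veh/h, next queue in veh).
    """
    bound = min(d + w / T, merging_capacity(rho, q_max, rho_cr, rho_max))
    # An unmetered ramp takes the lower bound; a metered ramp is capped by it.
    q = bound if metered_flow is None else min(metered_flow, bound)
    w_next = w + T * (d - q)          # conservation of vehicles
    return q, w_next


# Hypothetical parameters: densities in veh/lane-km, flows in veh/h.
PARAMS = dict(T=10.0 / 3600.0, q_max=2000.0, rho_cr=33.5, rho_max=180.0)

q_free, w_free = origin_step(w=0.0, d=1500.0, rho=20.0, **PARAMS)
q_cong, w_cong = origin_step(w=0.0, d=1500.0, rho=120.0, **PARAMS)
q_met, w_met = origin_step(w=0.0, d=1500.0, rho=20.0, metered_flow=600.0, **PARAMS)
```

The second call shows the effect noted in the text: with an overcritical
mainstream density a queue builds up even though the ramp is not metered.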

A similar queue-based approach applies to freeway-to-freeway (ftf) interchanges 
as well. 

Freeway bifurcations and junctions (including on-ramps and off-ramps) are
represented by nodes. Traffic enters a node $n$ through a number of input links
and is distributed to the output links. The percentage of the total inflow at a
bifurcation node $n$ that leaves via the outlink $m$ is the turning rate
$\beta_n^m$, which can be easily estimated in real time.


15.3.2 Local ramp metering strategies 


Most implemented ramp metering systems are based on local control strategies
that address one single ramp at a time using traffic measurements from the
vicinity of the ramp. A most successful local feedback ramp metering algorithm
is ALINEA [12], [13] and its recent variations [15], [16]. ALINEA determines a
ramp flow $q_o^r(k)$ so as to maintain the traffic conditions in the merge
segment $(\mu, 1)$ close to a desired set value by use of an I-type regulator
with a control sample time $T_c$ that is a multiple of the model time step $T$,
i.e. $T_c = z_c T$, $z_c \in \mathbb{N}$.

If the set value $\tilde{\rho}_{\mu,1}$ concerns the merge segment density, we
have the original ALINEA [12], [13]

$$q_o^r(k_c) = q_o^r(k_c - 1) + K_r\,[\tilde{\rho}_{\mu,1} - \rho_{\mu,1}(k_c)] \tag{15.9}$$


where $k_c = z_c k$ and $K_r$ is the feedback gain factor. If the set value
$\tilde{q}_{\mu,1}$ concerns the merge segment outflow, we have the flow-based
ALINEA [15]


$$q_o^r(k_c) = q_o^r(k_c - 1) + K_F\,[\tilde{q}_{\mu,1} - q_{\mu,1}(k_c)] \tag{15.10}$$


with $K_F$ the feedback gain factor. Note that the same segment flow
$q_{\mu,1}(k)$ may be present for either undercritical or overcritical
(congested) density $\rho_{\mu,1}(k)$, hence utilisation of (15.10) is only
recommended for set values $\tilde{q}_{\mu,1}$ that are well below the freeway
capacity $q_{\mu,1}^{cap}$ [15].

Whichever regulator is used, the resulting flow $q_o^r(k)$ is bounded by the
constant ramp flow capacity $q_{o,\max}$ and a minimum admissible ramp flow
$q_{o,\min}$. In order to avoid wind-up, the term $q_o^r(k_c - 1)$ used in both
(15.9) and (15.10) is bounded accordingly.

In order to avoid the creation of large ramp queues that would interfere with
the surface street traffic, a queue control policy is employed in conjunction
with every local metering strategy. The queue control law takes the form


$$q_o^w(k_c) = -\frac{1}{T_c}\,[w_{o,\max} - w_o(k_c)] + d_o(k_c - 1) \tag{15.11}$$

where $w_{o,\max}$ is the maximum admissible ramp queue. Thus, the final on-ramp
outflow is


$$q_o(k_c) = \max\{q_o^r(k_c),\, q_o^w(k_c)\} \tag{15.12}$$


which means that queue control may override the ramp metering control whenever
necessary to avoid overspilling of the ramp queue.
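
The interplay of the density-based regulator (15.9), the bounds on the ramp
flow and the queue override (15.11), (15.12) can be sketched as follows. This
is a minimal illustration, not the field implementation of ALINEA: the function
names, the gain value and all numerical inputs are assumptions.

```python
def clamp(x, lo, hi):
    """Limit x to the interval [lo, hi]."""
    return max(lo, min(hi, x))


def alinea(q_prev, rho_set, rho_meas, gain, q_min, q_max):
    """Density-based ALINEA step (15.9) with bounded output.

    The previous flow is clamped as well, which is the anti-wind-up measure
    mentioned in the text.
    """
    q_prev = clamp(q_prev, q_min, q_max)
    return clamp(q_prev + gain * (rho_set - rho_meas), q_min, q_max)


def queue_control(w, w_max, d_prev, T_c):
    """Queue control law (15.11): the flow that would bring the queue to w_max."""
    return -(w_max - w) / T_c + d_prev


def ramp_outflow(q_r, q_w):
    """Final on-ramp outflow (15.12): queue control overrides when larger."""
    return max(q_r, q_w)


# Hypothetical numbers: flows in veh/h, densities in veh/lane-km, T_c = 60 s.
T_c = 60.0 / 3600.0
q_r = alinea(q_prev=800.0, rho_set=33.0, rho_meas=38.0, gain=70.0,
             q_min=200.0, q_max=1800.0)
q_w = queue_control(w=95.0, w_max=100.0, d_prev=900.0, T_c=T_c)
q_final = ramp_outflow(q_r, q_w)
```

In this example the measured density is above the set value, so the regulator
lowers the ramp flow, but the queue is close to its limit and the queue control
flow is the one finally applied.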

Typically, ALINEA (15.9) is used with a set value
$\tilde{\rho}_{\mu,1} = \rho_{\mu,cr}$ so as to maximise the mainstream flow
$q_{\mu,1}$. However, if this is done in several metered ramps independently (no
coordination), it may lead to an unbalanced utilisation of the available ramp
storage spaces, whereby mainstream congestion may not be avoided due to queue
control overrides at some ramps. This is a main motivation for developing
coordinated network-wide ramp metering strategies that can exploit the available
ramp storage spaces in an optimal way.


15.3.3 Formulation of an optimal control problem for coordinated ramp 
metering 


The traffic flow model described in section 15.3.1 may be extended as follows to
include the impact of ramp metering actions. If ramp metering is applied at a
ramp $o$, the outflow $q_o(k)$ is a portion $r_o(k)$ of the flow that would
leave the ramp in absence of ramp metering. Thus, $r_o(k) \in [r_{o,\min}, 1]$
is the metering rate for origin link $o$, i.e. a control variable, where
$r_{o,\min}$ is a minimum admissible value while for $r_o(k) = 1$ no ramp
metering is applied.

The overall network traffic model has then the general state space form

$$x(k+1) = f[x(k), r(k), d(k)] \tag{15.13}$$


where the state of the traffic flow process is described by the state vector
$x \in \mathbb{R}^N$ and its evolution depends on the system dynamics and the
input variables. The input variables are distinguished into control variables
$r \in \mathbb{R}^M$ and uncontrollable external disturbances
$d \in \mathbb{R}^P$. In our case the state vector $x$ consists of the densities
$\rho_{m,i}$ and mean speeds $v_{m,i}$ of every segment $i$ of every link $m$,
and the queues $w_o$ of every origin $o$. The control vector $r$ consists of the
ramp metering rates $r_o$ of every on-ramp $o$ under control, with
$r_{o,\min} \le r_o(k) \le 1$. Finally, the disturbance vector consists of the
demands $d_o$ at every origin of the network, and the turning rates $\beta_n^m$
at the network’s bifurcations. The disturbance trajectories $d(k)$ are assumed
known over the time horizon $K_P$. For practical applications, these values may
be predicted based on historical data and, if necessary, on real-time
estimations, see [17].

The coordinated ramp metering control problem is formulated as a discrete-time 
dynamic optimal control problem with constrained control variables which can be 
solved numerically over a given optimization horizon Kp [14]. The chosen cost cri- 
terion aims at minimizing the Total Time Spent (TTS) of all vehicles in the network 
(including the waiting time experienced in the ramp queues). The minimization of 
TTS is a natural objective for the traffic systems considered here, as it represents the 
total time spent by all users in the network. Penalty terms are added appropriately 
to the cost criterion in order for the solution to comply with the maximum queue 
constraints. 
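
To make the cost criterion concrete, the sketch below evaluates a Total Time
Spent value from density and queue trajectories and adds a queue penalty. It is
only an illustration: the chapter does not state the exact form of the penalty
terms, so the quadratic penalty on queue excess, the weight and all data are
assumptions, as are the function and argument names.

```python
def tts_cost(densities, queues, seg_length_km, lanes, T_h,
             w_max=None, penalty_weight=0.0):
    """Total Time Spent (veh-h) plus an assumed quadratic queue penalty.

    densities[k][j]: density of segment j at step k (veh/lane-km)
    queues[k][o]:    queue of origin o at step k (veh)
    """
    cost = 0.0
    for rho_k, w_k in zip(densities, queues):
        # density * length * lanes gives the number of vehicles in a segment
        on_freeway = sum(r * L * lam
                         for r, L, lam in zip(rho_k, seg_length_km, lanes))
        in_queues = sum(w_k)
        cost += T_h * (on_freeway + in_queues)
        if w_max is not None:
            # assumed penalty: squared excess over the maximum admissible queue
            cost += penalty_weight * sum(max(0.0, w - wm) ** 2
                                         for w, wm in zip(w_k, w_max))
    return cost


# Hypothetical two-step trajectory: two segments, one on-ramp.
T_h = 10.0 / 3600.0
densities = [[20.0, 30.0], [25.0, 35.0]]
queues = [[10.0], [120.0]]
lengths = [0.5, 0.5]
lanes = [3, 3]

plain_tts = tts_cost(densities, queues, lengths, lanes, T_h)
penalized = tts_cost(densities, queues, lengths, lanes, T_h,
                     w_max=[100.0], penalty_weight=1.0)
```

The penalty only becomes active in the second step, where the queue of 120 veh
exceeds the assumed limit of 100 veh.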

In [6] this nonlinear optimal control problem formulation combined with a pow- 
erful numerical optimization algorithm resulted in the AMOC open-loop control tool 
that is able to consider coordinated ramp metering, route guidance as well as inte- 
grated control combining both control measures. In [2], [3], [4] the results of AMOC 
application to the problem of coordinated ramp metering at the Amsterdam ring- 
road are presented in detail with special focus on the equity issue. The solution de- 
termined by AMOC consists of the optimal ramp metering rate trajectories and the 
corresponding optimal state trajectories. 

Due to various inherent uncertainties the open-loop optimal solution becomes 
suboptimal when directly applied to the freeway traffic process. In this chapter, the 
optimal results are cast in a model-predictive frame and are viewed as targets for 
local feedback regulators which leads to a hierarchical control structure similar to 
that proposed in [10], albeit with a more sophisticated optimal control approach. 


15.3.4 Hierarchical control 


The solution provided by AMOC is of an open-loop nature. As a consequence, 
its direct application may lead to traffic states different than the calculated optimal 
ones due to errors associated with the system’s initial state estimation x(0), with 
the prediction of the future disturbances d(k), k = 0,..., Kp — 1, with the model 
parameters based on which AMOC determines the optimal solution, as well as errors 
due to unpredictable incidents in the network. 


292 A. Kotsialos and M. Papageorgiou 


Fig. 15.5. Hierarchical control structure.
[Block diagram. Estimation/Prediction Layer: state estimation and prediction of
disturbances, fed by historical data and measurements; it passes the current
state estimation and the predicted disturbance trajectories to the Optimization
Layer (AMOC, optimal control). The Optimization Layer passes the open-loop
optimal solution and the optimal state trajectory to the Direct Control Layer
(local regulators). These computer-side layers act on the real-world freeway
network traffic flow process, which is subject to on-ramp demand, weather
conditions, incidents and routing behaviour.]


Since estimation, modeling and prediction errors are inevitable, a receding hori- 
zon approach (model-predictive control) is employed to address any mismatch be- 
tween the predicted and actual system behavior. This approach is suitably extended 
to the hierarchical control system depicted in Fig. 15.5, which consists of three lay- 
ers. 

The Estimation/Prediction Layer receives as input historical data, information
about incidents and real-time measurements from sensors installed in the freeway
network. This information is processed in order to provide the current state
estimation and future predictions of the disturbances to the next layer.

The Optimization Layer (AMOC) considers the current time as $t = 0$ and uses
the current state estimate as initial condition $x_0$. Given the predictions
$d(k)$, $k = 0, \ldots, K_P - 1$, the optimal control problem is solved,
delivering the optimal control trajectory (translated into optimal on-ramp
outflows) and the corresponding optimal state trajectory. These trajectories are
forwarded as set values to the decentralized Direct Control Layer, which has the
task of realizing the suggested policy.

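For each on-ramp $o$ with merging segment $(\mu, 1)$ (Fig. 15.4) a local
regulator is applied with control sample time $T_c = z_c T$,
$z_c \in \mathbb{N}$, in order to calculate the on-ramp outflow $q_o^r(k_c)$,
where $k_c = z_c \cdot k$. We define the average quantities
$\bar{\rho}^*_{\mu,1}(k_c) = \sum_z \rho^*_{\mu,1}(z)/z_c$ and
$\bar{q}^*_{\mu,1}(k_c) = \sum_z q^*_{\mu,1}(z)/z_c$, where each sum runs over
the $z_c$ model time steps $z$ of control period $k_c$ and the *-index denotes
optimal values resulting from AMOC.

We distinguish two cases for later comparison. In the first case, the optimal
control trajectories are directly applied to the traffic process, i.e.


$$q_o^r(k_c) = \bar{q}^*_o(k_c), \tag{15.14}$$


with $\bar{q}^*_o(k_c)$ the optimal on-ramp outflow delivered by AMOC, averaged
in the same way over control period $k_c$. This is followed by the queue control
override (15.11), (15.12).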

In the second case, the Direct Control Layer is actually introduced. More
specifically, the regulators ALINEA and flow-based ALINEA ([12], [15]) are
employed as local regulators, while the optimal state trajectory is used to
determine the set-points for each particular on-ramp.

The flows $q^*_{\mu,1}$ are preferable as set-points for local regulation
because they are directly measurable without the uncertainty caused by
modelling. However, flows do not uniquely characterize the traffic state, as the
same flow may be encountered under non-congested or congested traffic
conditions. Hence a flow set-point $\tilde{q}_{\mu,1} = \bar{q}^*_{\mu,1}(k_c)$
is used (in conjunction with flow-based ALINEA) only if
$\bar{\rho}^*_{\mu,1}(k_c) < \rho_{\mu,cr}$ and
$\bar{q}^*_{\mu,1}(k_c) < 0.9\, q_{\mu,cap}$, i.e. only if the optimal flows are
well below the critical traffic conditions. If
$\bar{\rho}^*_{\mu,1}(k_c) \ge \rho_{\mu,cr}$, then ALINEA is applied during the
period $k_c$ with set-point $\tilde{\rho}_{\mu,1} = \bar{\rho}^*_{\mu,1}(k_c)$.
In any other case, ALINEA is applied with $\tilde{\rho}_{\mu,1} = \rho_{\mu,cr}$
so as to guarantee maximum flow even in presence of various mismatches.
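
The set-point selection rule for the Direct Control Layer can be written as a
small decision function. The sketch below is a hedged reading of the rule, not
the authors' code: it assumes that the second case applies when the averaged
optimal density is at or above the critical density, and the function name,
return labels and numbers are illustrative.

```python
def select_set_point(rho_opt, q_opt, rho_cr, q_cap):
    """Choose the local regulator and its set value for one control period.

    rho_opt, q_opt: averaged optimal density and flow of the merge segment
    rho_cr, q_cap:  critical density and capacity of the merge segment
    Returns (regulator name, set value).
    """
    if rho_opt < rho_cr and q_opt < 0.9 * q_cap:
        # optimal state well below critical conditions: regulate the flow
        return "flow-based ALINEA", q_opt
    if rho_opt >= rho_cr:
        # assumed reading: optimal state at or above critical density
        return "ALINEA", rho_opt
    # undercritical density but flow close to capacity: aim at critical density
    return "ALINEA", rho_cr


# Hypothetical values: densities in veh/lane-km, flows in veh/h.
case_flow = select_set_point(rho_opt=25.0, q_opt=3000.0, rho_cr=33.5, q_cap=6000.0)
case_dense = select_set_point(rho_opt=40.0, q_opt=5000.0, rho_cr=33.5, q_cap=6000.0)
case_near = select_set_point(rho_opt=32.0, q_opt=5800.0, rho_cr=33.5, q_cap=6000.0)
```

The third case is the fallback of the text: the regulator targets the critical
density so that maximum flow is maintained even if the optimal trajectory is
slightly off.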

The update period or application horizon of the model-predictive control is
$K_A < K_P$, after which the optimal control problem is solved again with
updated state estimation and disturbance predictions, thereby closing the
control loop of AMOC as in model-predictive control. The control actions will be
generally more efficient with increasing $K_P$ and decreasing $K_A$.
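
The rolling-horizon mechanism can be summarized in a short loop. The following
sketch only shows the control flow: the optimizer, the predictor and the plant
are toy placeholders invented for this illustration (AMOC and METANET are not
reproduced), and the state is assumed to be known exactly at every update.

```python
def run_receding_horizon(x0, total_steps, K_P, K_A, predict, optimize, plant_step):
    """Re-optimize every K_A steps over a horizon of up to K_P steps."""
    x, k, history = x0, 0, []
    while k < total_steps:
        horizon = min(K_P, total_steps - k)
        d_pred = predict(k, horizon)            # Estimation/Prediction Layer
        plan = optimize(x, d_pred)              # Optimization Layer (open loop)
        for j in range(min(K_A, horizon)):      # apply only the first K_A steps
            x = plant_step(x, plan[j], k)       # real process, may deviate
            history.append(x)
            k += 1
    return history


# Toy placeholders (assumptions): a scalar "queue" with mismatched disturbance.
optimizer_calls = []

def predict(k, horizon):
    return [1.0] * horizon                      # predicted disturbance

def optimize(x, d_pred):
    optimizer_calls.append(x)                   # record each re-optimization
    return [0.5 * x] * len(d_pred)              # placeholder open-loop plan

def plant_step(x, u, k):
    return x - u + 1.2                          # true disturbance differs from 1.0

history = run_receding_horizon(10.0, total_steps=12, K_P=6, K_A=2,
                               predict=predict, optimize=optimize,
                               plant_step=plant_step)
```

With 12 steps and an application horizon of 2 steps the optimizer is called six
times, each time starting from the state actually reached by the process.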


15.4 Simulation Results 


15.4.1 The Amsterdam network 


For the purposes of our study, the counter-clockwise direction of the A10
freeway, which is about 32 km long, is considered. There are 21 on-ramps on this
freeway, including the junctions with the A8, A4, A2, and A1 freeways, and 20
off-ramps, including the connections with A4, A2, A1, and A8. The topological
network model may be seen in Fig. 15.6. It is assumed that ramp metering may be
performed at all on-ramps. The model parameters for this network were determined
from validation of the network traffic flow model against real data [5].

Fig. 15.6. The Amsterdam ring-road model.

The ring-road was divided in 76 segments with average length 421m. This means 
that the state vector is 173-dimensional (including the 21 on-ramp queues). With 
ramp metering applied to all on-ramps, the control vector is 21-dimensional, while 
the disturbance vector is 41-dimensional. 

The network traffic model described in section 15.3.1 is available as a
macroscopic simulator METANET [8] to be used for simulation purposes. This means
that the same model is used here for both the optimal control AMOC and the
simulator METANET, albeit under some mismatch conditions detailed later.


15.4.2 The no-control case 


Using real (measured) time-dependent demand and turning rate trajectories as
input to METANET for the evening peak period 16:00-20:00 without any control
measures, heavy congestion appears in the freeway and large queues are built in
the on-ramps. The density evolution profile is displayed in Fig. 15.7 and the
corresponding queue evolution profile in Fig. 15.8. The excessive demand coupled
with the uncontrolled entrance of drivers into the mainstream causes congestion
(Fig. 15.7). This congestion originates at the junction of A1 with A10 and
propagates upstream, blocking the A4 and a large part of the A10-West. As a
result many vehicles are accumulated in the ftf on-ramp of A4 (i.e. we have a
spillback of the congestion onto the A4 freeway) and in the surrounding
on-ramps (Fig. 15.8). The TTS for this scenario is equal to 14,167 veh-h. The
described simulated traffic conditions correspond quite accurately to the real
uncontrolled traffic conditions in this network during the evening peak period.

Fig. 15.7. No-control scenario: Density evolution profile.

Fig. 15.8. No-control scenario: Ramp queue evolution profile.


15.4.3 Application of ALINEA 


In this section the application of the ALINEA strategy (15.9) to all on-ramps is
examined. ALINEA is used as a stand-alone strategy for each on-ramp without any
kind of coordination. The set-point for each on-ramp $o$ is set equal to the
critical density of the corresponding link $\mu$, i.e.
$\tilde{\rho}_{\mu,1} = \rho_{\mu,cr}$, so as to maximize the local freeway
throughput. Two cases are considered with respect to the presence or not of the
maximum queue constraint in the sense of (15.12). In case the maximum queue
constraints are active, we will assume that the maximum queue length for the
urban on-ramps is 100 veh and for the ftf ramps 200 veh. Furthermore, we assume
that there is no re-routing of the drivers towards the surrounding urban network
when they are confronted with large queues at the on-ramps.

Fig. 15.9. ALINEA control: Ramp queue profile without queue constraints.

The application of ALINEA without queue constraints leads to a significant
amelioration of the traffic conditions and the TTS is reduced to 7,924 veh-h,
which is an improvement of 44% compared to the no-control case. The critical
point, however, is in the queue evolution profile, where it may be seen
(Fig. 15.9) that a huge queue is formed at the A1 ftf ramp, which actually
prevents A1’s demand from triggering the congestion at the junction of A1 with
A10. Clearly, the large A1 ramp queue is not acceptable because it incurs
excessive delays for the corresponding ramp users (albeit to the strong benefit
of the rest of the driver population).

When maximum queue constraints are considered in the sense of (15.12), the
application of ALINEA becomes less efficient and the resulting TTS equals 10,478
veh-h, a 26% improvement over the no-control case. The reduction of the
strategy’s efficiency is due to the fact that the creation of the large queue in
the A1 ftf ramp is not allowed any more, hence a congestion is created there,
propagates upstream and triggers ALINEA action in further upstream ramps
(Fig. 15.10).


15.4.4 Application of hierarchical control 


First the optimal open-loop solution under the assumption of perfect information 
with respect to the future disturbances for the entire simulation time is considered. 
This solution serves as an “upper bound” for the efficiency of the control strategy as 
it relies on ideal conditions. The TTS in this case becomes 6,974 veh-h, which is a 
50.8% improvement over the no-control case. 

Fig. 15.10. ALINEA control: Ramp queue profile with queue constraints.


As mentioned in section 15.3.4, however, the results obtained by the optimal 
open-loop control are not realistic because the assumption of perfect knowledge of 
the future disturbances cannot hold in practice. The hierarchical control proposed is 
able to cope with this problem by employing the rolling horizon technique. For its 
application we will use $K_P = 360$ (1 hour) and $K_A = 60$ (10 min). For the purposes
of this control scenario, it is assumed that the state of the system is known exactly 
when AMOC is applied every 10 minutes, which is a fairly realistic assumption. 

With respect to the on-ramp demands, we assume that a fairly good predictor is 
available. Fig. 15.11 depicts an example of the actual and predicted demand, for the 
ftf on-ramp A8. The actual demand is input to the simulator METANET while the 
predicted trajectory is input to AMOC. With respect to the prediction of the turning 
rates, it is possible, based on historical data, to find a mean value for every turning 
rate for the considered time period. Thus, while METANET considers the real time- 
dependent turning rates, AMOC uses the average turning rates. Finally, we assume 
that there is no mismatch between the model parameters used by METANET and 
the corresponding parameters used by AMOC and that there are no incidents in the 
network. 

As mentioned in section 15.3.4, there are two cases for the application of AMOC 
results. In the first case, the optimal ramp flows calculated by AMOC are directly 
applied to the traffic flow process (with ramp queue override when necessary). In the 
second case, the ALINEA and flow-based ALINEA strategies are employed. In the 
first case the TTS becomes equal to 8,267 veh-h, which is a 41.6% improvement over 
the no-control case and 18.5% worsening compared to the optimal open-loop control. 
When ALINEA is used at the direct control layer, the TTS becomes equal to 8,086 
veh-h, which is a 42.9% improvement over the no-control case and 15.9% larger than 
the TTS of the optimal open-loop control. The density and queue evolution profiles of 
the second case are depicted in Figs. 15.12 and 15.13, respectively. The TTS difference 
between the direct application of AMOC results and the use of ALINEA 
as a Direct Control Layer is minor in this example; ongoing investigations indicate 
large improvements in case of more significant mismatch between AMOC and the 
METANET simulator. 

Fig. 15.12. Hierarchical control with ALINEA: Density profile. 
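
As a quick consistency check on the figures reported above, the following sketch backs out the no-control and open-loop TTS values implied by each of the two cases. Only the two TTS values and the four percentages come from the text; the variable names are illustrative, and the implied baselines are derived quantities, not reported results.

```python
# Back out the baselines implied by the reported TTS values and percentages.
cases = {
    "direct AMOC": {"tts": 8267.0, "improvement": 0.416, "above_open_loop": 0.185},
    "AMOC + ALINEA": {"tts": 8086.0, "improvement": 0.429, "above_open_loop": 0.159},
}

implied = {}
for name, c in cases.items():
    # TTS = (1 - improvement) * no-control TTS
    no_control = c["tts"] / (1.0 - c["improvement"])
    # TTS = (1 + excess) * open-loop TTS
    open_loop = c["tts"] / (1.0 + c["above_open_loop"])
    implied[name] = (no_control, open_loop)
    print(f"{name}: no-control ~ {no_control:.0f} veh-h, open-loop ~ {open_loop:.0f} veh-h")
```

Both cases imply nearly identical baselines, so the reported percentages are mutually consistent.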

Comparing the on-ramp queue evolution profile of Fig. 15.13 with the corre- 
sponding profile in the case of ALINEA with queue control (Fig. 15.10), the differ- 
ence between both control strategies becomes apparent. In the ALINEA case, queues 
are built in the second half of the simulation horizon, in reaction to the congestion 
that has been formed. In the hierarchical control case the queues are built early in the 
simulation time in anticipation of the future congestion. Furthermore, this is done 
in such a manner that the maximum queue constraints are taken into consideration 
without serious degradation of the strategy’s efficiency. 

Fig. 15.13. Hierarchical control with ALINEA: Ramp queue profile. 


15.4.5 Equity 


The maximum queue constraints may also be used to implicitly address the prob- 
lem of equity [3]. Fig. 15.14 depicts the average time spent by a vehicle in the ramp 
queue plus traveling 6.5 km downstream on the freeway, for the no-control case and 
the three control scenarios considered. It can be seen that in the no-control case 
the mean travel time is large at the A10-West ramps as a direct consequence of the 
created congestion. Without queue control ALINEA reduces the mean time for all 
on-ramps but for A1, where a large peak appears due to the extended delays in the 
on-ramp queue (Fig. 15.9). The introduction of the queue constraints for ALINEA 
reduces the mean travel time at A1 but leads to significant travel time increases in 
other upstream on-ramps of A10-South and A10-West due to mainstream conges- 
tion. Clearly this is not a fair distribution of the ramp delays required for the ame- 
lioration of the traffic conditions. In the case of the hierarchical control strategy, the 
travel times for virtually all on-ramps are significantly lower than for no-control or 
ALINEA with queue constraints. The high peaks at A1 and A2 are no longer present, 
at the expense of a relatively small increase in the travel times of the on-ramps 
upstream of A1 compared to the case of ALINEA without queue constraints. The 
hierarchical controller distributes the delays in a more balanced 
way, which is more equitable for the drivers, especially those of A1 and A2. Thus, 
the proposed hierarchical control leads to a substantial amelioration of the TTS of 
the whole driver population (efficiency) by improving the travel times of drivers for 
virtually every individual on-ramp (equity) compared to the no-control case. Since 
travel times for all ramps are reduced, the control scheme corresponds to a win-win 
situation and every driver should be happy with it. 

Fig. 15.14. Average travel time for queuing and traveling 6.5 km downstream for every on-ramp. 


15.4.6 Computation time 


In order for the hierarchical control to be applied in the field, the computation 
time needed for the numerical solution of the associated optimal control problem 
at each application must be sufficiently low for the real-time application of this ap- 
proach to be feasible. The required CPU-time varies from application to application, 
but generally the algorithm converges very fast to an optimal solution within a few 
CPU-seconds (1 GHz P3 processor with Linux), which indicates that real-time 
application of the control strategy in the field is feasible even for application periods 
much shorter than the 10 min employed here. 


15.5 Conclusions 
Modern freeway network capacity is daily underutilized, particularly during rush 
hours and at the occurrence of incidents, i.e. when it is most urgently needed, due to: 


• reduced congestion outflow (see section 15.2.2) 
• reduced off-ramp flow (see section 15.2.3) 
• uncontrolled flow distribution in the overall network (see section 15.2.4). 


The introduction of ramp metering at some particular ramps or particular freeway 
stretches within the overall network can help to reduce some local traffic problems 
and to improve the local traffic conditions. However, the significant amelioration of 
the global traffic conditions in the overall traffic network calls for comprehensive 
control of all or most of the ramps, including the freeway-to-freeway links, in the 
aim of optimal utilisation of the available infrastructure. The limitations of partial 
(rather than comprehensive) ramp metering are: 


1. The potential benefits of partial ramp metering (according to Figs. 15.2, 15.3) 
may be counterbalanced to some extent by a modified route choice behaviour of 
drivers who attempt the minimisation of their individual travel times under the 
new conditions. 

2. Individual on-ramps have a limited storage capacity for waiting vehicles; if the 
on-ramp queue reaches back to the surface street junction, ramp metering control 
is typically released in order to avoid interference with surface street traffic and 
mainstream congestion cannot be avoided. 

3. The freeway network is a common resource for many driver groups with dif- 
ferent origins and destinations. Partial ramp metering, by its nature, does not 
address the strategic problem of optimal utilisation of the overall infrastructure, 
nor does it guarantee a fair and orderly capacity allocation among the ramps. 


Comprehensive ramp metering, on the other hand, does not suffer from these 
shortcomings, first because of complete control of the network traffic flow and its 
spatial and temporal distribution, and second because of sufficient available storage 
capacity. In fact, one or a few particular ramps located at a critical bottleneck area 
may not have sufficient storage capacity to completely avoid the building up of a con- 
gestion. However, in case of comprehensive optimal ramp metering in the sense of 
section 15.3.3, the total available storage space in all ramps and freeway intersections 
is usually sufficient to effectively and ultimately combat freeway congestion. 

It should be emphasized that the implementation and operation cost of a com- 
prehensive ramp metering system is estimated to be rather low as compared to the 
corresponding infrastructure cost and to the expected benefits in terms of dramati- 
cally reduced delays, increased traffic safety, and reduced environmental pollution. 
It should also be noted that the advanced methodological tools required for efficient 
operation of such a comprehensive ramp metering system are currently available, 
see section 15.3. The major problem to overcome today is the inertia of political 
decision-makers which, in turn, is mainly due to a lack of understanding of the 
huge potential of comprehensive ramp metering systems. 

We believe that freeway networks will have to be operated as completely 
controllable systems in the near future, similar to the urban traffic networks, 
because this is the smartest way to avoid further degradation and even fatal 
gridlocks. The sooner this is realized by the road authorities, the better for the 
road users who will be the major beneficiaries of this evolution. We would like 
to acknowledge the valuable research [1] and further manifold actions of Pravin 
Varaiya towards this end. 


16.1 Introduction 


Intelligent Transportation Systems (ITS) represent a natural convergence of many 
of the technologies, concepts, and problem domains in which Varaiya has made sem- 
inal contributions over his distinguished career to date. The overarching rationale 
of ITS is that developments in sensing, location, information and communication 
technologies can be put to effective use in improving the performance of transporta- 
tion systems and facilities. Inherently, transportation and communication systems 
bear many similarities: both are complex dynamic spatial systems, organized around 
hierarchical network structures, built to deliver services that meet critical human 
needs. Both carry flows that vary dynamically, with varying degrees of predictabil- 
ity, from origins to destinations, and require control architectures and operational 
rules to avoid conflict and enable flows to reach their destinations in a manner that 
maximizes efficient utilization of resources. The main differences lie in the underly- 
ing physics: in transportation systems, an added source of complexity is that human 
beings are the primary agents determining the behavior of the individual particles 
(vehicles) interacting nonlinearly through the network components. Traffic scientists 
and physicists have long recognized that this interaction produces collective effects 
that present both predictable patterns as well as sometimes volatile properties, which 
greatly affect the resulting performance of these systems, and their ability to meet 
users’ needs and expectations for safe and reliable travel [18]. 

The interplay between the individual behavior of particles and the collective 
properties of the system has played a central role in the development of the field 
of traffic science, and its applications to engineering practice. While advances in 
traffic science over the past half century have resulted in some well understood and 
reasonably predictable phenomena, many vexing questions remain, especially with 
regard to the behavior and properties of systems under high levels of demand relative 
to the available service resources (infrastructure). The resulting congestion is 
accompanied by a high degree of inefficiency, as the nature of human interactions 
in dense environments results in considerable service degradation and reduction in 
service rates (throughput) of the impacted facilities. 

At the root of this inefficiency are the cognitive and behavioral limitations of 
human drivers. Microscopic characteristics at the individual driver level, such as per- 
ception time lags, reaction times, and a natural tendency towards over-reaction under 
stressful situations or perceived risk, result in volatility, congestion, instability, frus- 
trating stop-and-go patterns, capacity loss, and other component and system level 
macroscopic phenomena. Eliminate or reduce individual human error, and the sys- 
tem will operate more efficiently. Monitor the state of the system at all times, and it 
would be possible to intervene and apply control actions in real-time to best utilize 
available resources. These two realizations have motivated the two main develop- 
ment directions for Intelligent Transportation Systems. Varaiya’s substantial contri- 
butions to the intellectual, theoretical, methodological and, increasingly, professional 
practice dimensions of ITS development address both of these opportunity targets. 

In the first area, Varaiya realized that by eliminating or minimizing the active role 
of the human being in the driving process, through automated control systems that 
rely on precise measurements of neighbor vehicle properties as well as accurate rep- 
resentation of the surrounding environment (e.g. critical roadway design features), 
one would considerably increase efficiency and reliability. Such systems have been 
called different things at different times, including automatic vehicle control sys- 
tems (AVCS) as well as simply automated highway systems (AHS). Varaiya’s con- 
tributions to this problem form the core scientific underpinnings for the operational 
analysis and system design of AHS. They provided a much needed access ramp for 
engineers and researchers from various disciplines, such as control theory and traffic 
science, to address these problems. In a seminal sole-authored paper [32] (Varaiya, 
1993), as well as in a collaborative paper with Hedrick and Tomizuka [17], Varaiya 
contributed field-defining works that articulated the principal control issues of AHS 
(see also [19]). Varaiya and his students also produced several seminal contributions 
to traffic flow modeling under AHS operational rules, as well as to the formulation 
of rules and protocols for ensuring safe maneuvers in a mixed traffic environment, 
i.e. one in which automated vehicles share the right of way with human-controlled 
vehicles [13], [29], [4]. 

In the second area, Varaiya acted on the dual realization, one technical and the 
other professional/institutional, that (1) sensing prevailing conditions in traffic net- 
works plays a central role as a basis for “intelligence” in transportation systems, 
especially freeways; and (2) existing sensor systems already deployed are only deliv- 
ering a small fraction of their potential value to the owner agencies because of absent 
or arcane decision support tools that could enable traffic managers to query the rich 
database of accumulating traffic information. This gave rise to PeMS, a general Per- 
formance Measurement System, initially applied to California Freeways [11],[33], 
which has become a model showcase of how to effectively leverage the massive 
amount of sensor data collected on the transportation system. Built around an ele- 
gant software design that retains considerable simplicity and ease of use (via simple 
web browser), yet enables efficient data query and analysis, PeMS is a framework 
that allows the ITS community to leverage the investment in sensors and sensor data 
to support a wide range of operational and planning uses by both practitioners and 
researchers. It also makes all data available online, providing an excellent resource 
for researchers. 

Applications of the PeMS databases are numerous and span a range of 
operations and planning issues; some examples include travel 
time reliability assessment, systematic identification of freeway bottlenecks, and ex- 
ploration of the causes and cures of traffic congestion [10], [8], [9]. Additional work 
on sensors and traffic sensor data interpretation [23] is continuing, and is now target- 
ing the development of relatively low-cost wireless sensor networks for widespread 
deployment of traffic sensing capabilities. 

While the emergence of modern ITS ideas can be traced back to the late 1980’s, 
early ideas dating back to the 1960’s had been articulated by control theorists and 
traffic systems engineers. As is often the case with visionary technology applications, 
the technologies themselves (e.g. in wireless communications, wireless-assisted GPS 
location, mapping and geographic information systems) have already advanced well 
beyond the initial vision, though the deployment and institutional adoption of the 
overall systems have remained far short of the original designs. What is remarkable 
about Varaiya’s contributions to ITS is that they span the whole realm of underly- 
ing enabling technologies, to specific control structures and rules for the application 
of these technologies, as well as, more recently, decision support platforms for the 
delivery of mission-critical applications to user agencies that allow them to exploit 
the potential of their investment in sensing and monitoring infrastructure. With the 
data comes knowledge and with knowledge comes power—these investments in data 
analysis tools are already beginning to generate important payoffs in terms of knowledge 
about fundamental properties of the systems under observation. 

An important use of the PeMS data is the role it can play in the development, 
calibration, validation and operational deployment of advanced traffic network anal- 
ysis tools, such as network assignment-simulation models for real-time estimation 
and prediction of network states to support development of online routing and con- 
trol strategies. The rest of the chapter describes such an application conducted by the 
authors at the University of Maryland, for which PeMS data for Orange County in 
California played an important role. The chapter focuses specifically on the frame- 
work devised to estimate and predict dynamic origin-destination (OD) trip demand 
patterns, and update these patterns from one period (one day) to the next. The pre- 
dicted OD demand serves as input to a network simulation capability that incorpo- 
rates users’ responses to supplied traffic information and control actions. 

The rest of the chapter is organized as follows. Following the motivation in the 
next section, a structural state space model for real-time OD estimation and pre- 
diction is presented, within a rolling horizon execution framework in connection 
with real-time dynamic traffic assignment simulators. By considering demand de- 
viations from the a priori estimate of the regular pattern as a time-varying process 
with smooth trend, a polynomial trend filter is developed as the core model to cap- 
ture possible structural deviations in real-time demand. In Section 16.4, a Kalman 
filter formulation and the corresponding optimal updating algorithms are presented 
to keep track of the up-to-date regular demand pattern using real-time information. 
Section 16.5 describes the application results of the proposed models and algorithms 
using real-world PeMS data for the Irvine, Orange County network. 


16.2 Motivation and Background 


The premise of ITS is the ability to sense prevailing conditions and rapidly devise 
actions to optimize system performance in real-time. Because the dynamics of traffic 
systems are complex, as they depend on the interaction of many independent agents 
(drivers) acting non-cooperatively in a spatially connected network, many situations 
call for strategies that anticipate unfolding conditions instead of adopting a purely 
reactive approach. Real-time simulation of the traffic network forms the basis of 
a state prediction capability that fuses historical data with sensor information, and 
uses a description of how traffic behaves in networks to predict future conditions, 
and accordingly develop control measures. Because these actions are predicated on 
network conditions, which in turn depend on the users’ decisions, network states 
have to be determined simultaneously with the tripmaker choices, generally in an 
iterative scheme. The estimated state of the network and predicted future states, in 
terms of flows, travel times and other time-varying performance characteristics on 
the various components of the network, are used in the on-line generation and real- 
time evaluation of a wide range of measures, including information supply to users. 
The core of the descriptive dynamic traffic assignment (DTA) capability is a traffic 
simulation model, intended to capture the dynamics of traffic flow movement in the 
network [24], [20], [26], [25]. 

The two capabilities above (descriptive and normative), along with their sup- 
port functions, are integrated in the DYNASMART-X DTA System, to provide, in 
real-time: (1) estimates of network traffic conditions, (2) predictions of network flow 
patterns over the near and medium terms in response to various contemplated traf- 
fic control measures and information dissemination strategies, and (3) routing in- 
formation to guide trip-makers in their travel. The system includes several functional 
modules (for OD estimation, OD prediction, real-time network state simulation, con- 
sistency checking, updating and resetting functions, and network state prediction), 
integrated through a flexible distributed design that uses CORBA (Common Object 
Request Broker Architecture) standards, for real-time operation in a rolling horizon 
framework with multiple asynchronous horizons for the various modules. 

Dynamic origin destination (OD) demand estimation and prediction is an im- 
portant capability in its own right, and an essential support function for real-time 
dynamic traffic assignment (DTA) model systems for ITS applications. The dynamic 
OD demand estimation and prediction problem seeks to estimate time-dependent OD 
trip demand patterns at the current stage, and predict demand volumes over the near 
and medium terms in a general network, given historical demand information and 
real-world traffic measurements from various surveillance devices (e.g. occupancy 
and volume observations from loop detectors on specific links). 

Substantial research efforts have been devoted to dynamic demand estimation 
and prediction problems over the past 20 years. Existing models can be grouped into 
two classes: DTA based vs. non-DTA based, depending on whether a DTA compo- 
nent is incorporated into the estimation process [22],[6],[28]. In this chapter, existing 
models are categorized according to the underlying assumptions in representing dy- 
namic demand processes. Assuming that the deviations of flow (demand) from his- 
torical averages define a stationary time series, the first group applies auto-regressive 
(AR) models to the recursive estimation and prediction process. In the Kalman filter- 
ing formulation proposed by Okutani and Stephanedes [27], the original data is first 
detrended from historical observations, then an AR model is used to estimate and 
forecast time-varying traffic flows on a single link. Along the same line, Ashok and 
Ben-Akiva [2],[3] formulated deviations of OD demand from historical averages as 
AR processes, and further developed a Kalman filter for real-time OD demand estimation 
and prediction, in which a 4th-order AR model is adopted based on several 
data sets. In general, an autoregressive model is suitable to describe a stationary ran- 
dom process with constant mean and variance. On the other hand, if the prevailing 
OD demand is structurally different from the regular demand pattern, demand devia- 
tions will not satisfy the fundamental stationarity assumption for AR processes, and 
such non-stationarity could seriously degrade the overall prediction performance. In 
addition, an AR type model with high-order terms requires extensive off-line cali- 
bration effort for the autocorrelation coefficients, and the corresponding augmented 
state space also dramatically increases the on-line computational burden, especially 
for large-scale network applications. 
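
The AR idea described above can be illustrated with a minimal sketch: subtract the historical average, fit an AR(1) coefficient to the deviations by least squares, and forecast the next interval. The data are synthetic and the function names are hypothetical; this is not the formulation of [27] or [2],[3], which embed the AR model in a Kalman filter and may use higher orders.

```python
import numpy as np

def fit_ar1(dev):
    """Least-squares AR(1) coefficient for a zero-mean deviation series."""
    x, y = dev[:-1], dev[1:]
    return float(np.dot(x, y) / np.dot(x, x))

def forecast_ar1(historical_next, last_dev, phi):
    """Forecast = historical average for the next interval + phi * last deviation."""
    return historical_next + phi * last_dev

rng = np.random.default_rng(0)
T = 200
# Synthetic regular (historical average) pattern, one extra point for the forecast.
historical = 100.0 + 30.0 * np.sin(np.linspace(0.0, 2.0 * np.pi, T + 1))

# Synthetic stationary deviations generated by an AR(1) process with phi = 0.7.
dev = np.zeros(T)
for s in range(1, T):
    dev[s] = 0.7 * dev[s - 1] + rng.normal(0.0, 2.0)
observed = historical[:T] + dev

phi_hat = fit_ar1(observed - historical[:T])
next_flow = forecast_ar1(historical[T], dev[-1], phi_hat)
print(f"estimated phi = {phi_hat:.2f}, forecast for next interval = {next_flow:.1f}")
```

If the prevailing demand shifts structurally, so that the deviations drift or trend rather than fluctuate around zero, a coefficient fitted this way no longer describes the process, which is the stationarity limitation noted above.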

Alternatively, without requiring prior demand information, a simple random walk 
model can be relatively easily built for short-term demand prediction, corresponding 
to an AR(1) model with autocorrelation coefficient of 1. Cremer and Keller [14],[15], 
as well as Chang and Wu [7] applied the random walk model to predict dynamic 
OD flow split parameters, by directly extending the latest estimates as the future 
forecasts. Although this model is effective for a slowly changing process, it might 
not be rich enough to capture non-linear trends in time-varying OD flows, especially 
for medium term prediction. In order to describe the non-linearity in dynamic OD 
demand, Kang [22] and Mahmassani et al. [26] proposed a polynomial trend filter to 
estimate time-dependent OD flows on a general network, using historical information 
to calibrate demand evolution processes. 
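
To illustrate why a random walk can fall short for medium-term prediction, the sketch below compares it with a simple polynomial trend extrapolation on a synthetic, steadily drifting deviation series. The quadratic fit via `numpy.polyfit` is a stand-in chosen for brevity; it is not the recursive polynomial trend filter of [22] or [26], and the window length and data are assumptions.

```python
import numpy as np

def random_walk_forecast(series, steps):
    """Extend the latest estimate unchanged over the prediction horizon."""
    return np.full(steps, series[-1])

def polynomial_trend_forecast(series, steps, order=2, window=8):
    """Fit a low-order polynomial to the most recent points and extrapolate."""
    recent = series[-window:]
    x = np.arange(window)
    coeffs = np.polyfit(x, recent, order)
    future_x = np.arange(window, window + steps)
    return np.polyval(coeffs, future_x)

# Synthetic deviation with a smooth nonlinear trend (no noise, for clarity).
s = np.arange(20, dtype=float)
deviation = 0.5 * s + 0.05 * s**2
history, truth = deviation[:15], deviation[15:]

rw = random_walk_forecast(history, steps=5)
pt = polynomial_trend_forecast(history, steps=5)
rw_err = float(np.mean(np.abs(rw - truth)))
pt_err = float(np.mean(np.abs(pt - truth)))
print("random walk mean abs error:     ", rw_err)
print("polynomial trend mean abs error:", pt_err)
```

On this noise-free quadratic series the polynomial extrapolation tracks the trend while the random walk lags further behind with each step; with noisy data the gap would be smaller and the polynomial order would need care.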

In a closely related problem area, approaches for off-line time-varying OD de- 
mand estimation have also been proposed in the past decade, mostly for operational 
planning applications. Using a simplified assignment model, Cascetta et al. [5] pre- 
sented a generalized least squares framework for estimating time-varying demand in 
a network. A bi-level DTA-based time-varying demand estimation formulation was 
introduced by Tavana and Mahmassani [31] and further extended by Zhou et al. [35] 
to utilize multi-day link counts. In contrast, little attention has been given to proce- 
dures for effectively and systematically updating the historical demand information 
for on-line estimation and prediction purposes. Ashok [1] suggested several heuristic 
approaches to update the historical demand estimate with recent estimates obtained 
in real-time, but no optimal updating formulation was given. 

In general, regular OD trip desires can be viewed as a repeated process with sim- 
ilar within-day dynamic patterns. By utilizing knowledge from household interview 
surveys and off-line estimation results on multiple days, historical demand data rep- 
resents the a priori estimate of the regular OD demand pattern. In particular, in the 
context of long-range demand prediction, reliable historical data can serve as an in- 
formative source under normal conditions. On the other hand, it is necessary to recog- 
nize the possible existence of structural deviations of real-time OD demand from the 
average pattern; these might be caused by severe weather conditions, special events, 
as well as the responses of travelers to information and/or other system management 
measures. The first two factors have been well recognized as critical determinants in 
the effectiveness of travel demand management systems. With increasing availabil- 
ity, traveler information, particularly, pre-trip information, is expected to play a more 
active role in gradually changing day-to-day trip-making decisions and the resulting 
temporal distributions of OD demand. In addition, random fluctuations would still 
account for the effect of other unobserved factors and the inherent stochastic nature 
of daily time-varying demand. 

In the early deployment of real-time OD estimation and prediction, a common 
issue is that only unreliable historical demand data with significant uncertainty is 
available, often consisting of out-of-date survey data and limited surveillance data. 
In this case, as the prior estimate cannot adequately describe the average conditions, 
the real-time estimate becomes more informative in the sense that it captures the 
prevailing demand pattern and encapsulates up-to-date demand information. 

To provide accurate and robust demand estimation and prediction for real-time 
dynamic traffic assignment in operational settings, the following primary functional 
requirements need to be satisfied: (1) incorporate regular demand information into 
the real-time demand prediction process; (2) recognize and capture possible struc- 
tural changes in demand patterns under various conditions; and (3) optimally update 
the a priori estimate of the regular pattern using new real-time estimation results and 
traffic observations. 

In this work, actual dynamic OD demand is decomposed into three meaningful 
components in a structural state space model, namely, 

true demand = regular pattern + structural deviations + random fluctuations. 
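
The decomposition can be made concrete with synthetic numbers. In the sketch below, all values (the peak-shaped regular pattern, a ramped structural deviation such as a special event might cause, and Gaussian noise) are invented for illustration only.

```python
import numpy as np

rng = np.random.default_rng(1)
intervals = np.arange(12)  # departure time intervals

# Repeated within-day pattern (regular demand).
regular = 200.0 + 50.0 * np.exp(-0.5 * ((intervals - 6) / 2.0) ** 2)
# Smooth structural deviation that starts growing at interval 4.
structural = np.where(intervals >= 4, 10.0 * (intervals - 3), 0.0)
# Random fluctuations standing in for unobserved factors.
random_fluct = rng.normal(0.0, 5.0, size=intervals.size)

true_demand = regular + structural + random_fluct

# With a perfect a priori estimate of the regular pattern, the residual is
# structural deviation plus noise; a trend model targets the smooth part.
residual = true_demand - regular
print(np.round(residual, 1))
```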

The next section first describes a rolling horizon execution framework for real- 
time OD estimation and prediction in connection with real-time DTA simulators, 
followed by the introduction of a structural state space model for real-time OD esti- 
mation and prediction. By considering demand deviations from the a priori estimate 
of the regular pattern as a time-varying process with smooth trend, a polynomial 
trend filter is developed as the core model to capture possible structural deviations in 
real-time demand. 


16.3 Structural Model for Real-Time OD Estimation and Prediction 


The rolling horizon framework in this chapter follows the system design of a 
real-time dynamic traffic assignment system [25],[31]. The scheme entails sequen- 
tial execution of the OD estimator and predictor, in conjunction with real-time DTA 
simulators. As shown in Figure 16.1, the prediction (or planning) horizon represents 
the time length for which forecasted OD demand should be available for the DTA 
simulator. The prediction horizon starts at the end of a roll period, which is the time 
shift between the respective beginning of consecutive prediction horizons. Predic- 
tions for a given period are based on the estimation results obtained during the roll 
period, using observations streaming in real-time over a certain observation period. 

Fig. 16.1. Illustration of rolling horizon implementation 


In the approach above, a thorny modeling issue in OD estimation is how to handle 
the effect of lagged OD demand on current link observations. This issue arises because each 
traveler takes a certain time to complete his/her trip in a large city network, and the 
resulting travel time can be very long depending on trip length and prevailing traffic 
conditions. Failure to recognize the existence of lagged demand would attribute all 
current flows to demands departing during the current estimation stage, potentially 
leading to serious bias in estimation results. One possible solution is to extend the 
dimension of the state variable vector so as to include all the lagged OD demand 
variables in the current estimation stage [27], but the resulting expanded state space 
could significantly increase the computational complexity. The proposed polynomial 
trend model offers a compact representation of lagged demands, as described in a 
later section. 
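
The effect of lagged demand can be sketched numerically. Assuming a link count is the sum of current and lagged OD demands, each weighted by a link proportion supplied by the DTA simulator, ignoring the lagged terms would misattribute a large share of the observed flow to the current interval. The array layout and all numbers below are hypothetical.

```python
import numpy as np

q = 2  # maximum lag, in departure time intervals

# demand[lag, j]: demand of OD pair j that departed `lag` intervals ago.
demand = np.array([[100.0, 60.0],    # current interval
                   [90.0, 50.0],     # one interval ago
                   [80.0, 40.0]])    # two intervals ago
# link_prop[lag, j]: share of that demand crossing the observed link now.
link_prop = np.array([[0.30, 0.10],
                      [0.40, 0.20],
                      [0.10, 0.05]])
assert demand.shape[0] == q + 1 and link_prop.shape == demand.shape

# Link count with all lags versus the current departure interval only.
count_all_lags = float(np.sum(link_prop * demand))
count_current_only = float(np.sum(link_prop[0] * demand[0]))
lagged_share = 1.0 - count_current_only / count_all_lags
print(f"link count = {count_all_lags:.1f}, share from lagged demand = {lagged_share:.0%}")
```

In this toy case well over half of the count comes from vehicles that departed in earlier intervals, which is why an estimator that ignores lags would be biased.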

The rolling horizon implementation of real-time OD estimation and prediction is 
stated as Algorithm 1. This approach integrates real-time OD estimation and predic- 
tion with other on-line DTA components; specifically, the DTA simulator is relied 
upon to generate link proportions for the OD estimation module at the current stage, 
and OD prediction provides future OD demands for the assignment and simulation 
in the next stage. 


16.3.1 Algorithm 1. Rolling horizon implementation for real-time OD 
estimation and prediction 


Step 1: Receive real-time traffic measurements from surveillance system. 
Step 2: Fetch link proportion data for the current estimation stage from the DTA 
simulator. 
Step 3: (OD estimation) Estimate time-varying OD demand matrices involved in the 
current estimation stage using the Kalman filtering method. 
Step 4: (OD prediction) Predict OD demand over the next prediction horizon. 
Step 5: Advance roll period forward, and then go back to Step 1. 
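
The steps above can be sketched as a simple loop. The four callables are hypothetical placeholders for the surveillance feed, the DTA simulator, the estimator and the predictor; in the usage example a least-squares solve stands in for the Kalman filtering of Step 3, and the prediction simply repeats the latest estimate.

```python
from typing import Callable, List
import numpy as np

def rolling_horizon(
    receive_measurements: Callable[[int], np.ndarray],
    fetch_link_proportions: Callable[[int], np.ndarray],
    estimate_od: Callable[[np.ndarray, np.ndarray], np.ndarray],
    predict_od: Callable[[np.ndarray, int], np.ndarray],
    n_stages: int,
    horizon: int,
) -> List[np.ndarray]:
    """Run Steps 1-5 of Algorithm 1 for a fixed number of stages."""
    predictions = []
    for k in range(1, n_stages + 1):
        counts = receive_measurements(k)                       # Step 1
        link_prop = fetch_link_proportions(k)                  # Step 2
        od_estimate = estimate_od(counts, link_prop)           # Step 3
        predictions.append(predict_od(od_estimate, horizon))   # Step 4
        # Step 5: the loop index advances the roll period.
    return predictions

# Hypothetical stand-ins: 3 observed links, 2 OD pairs, noise-free counts.
A = np.array([[0.5, 0.0], [0.2, 0.6], [0.0, 0.4]])
true_od = np.array([100.0, 50.0])

preds = rolling_horizon(
    receive_measurements=lambda k: A @ true_od,
    fetch_link_proportions=lambda k: A,
    estimate_od=lambda c, lp: np.linalg.lstsq(lp, c, rcond=None)[0],
    predict_od=lambda od, h: np.tile(od, (h, 1)),  # repeat latest estimate
    n_stages=3,
    horizon=4,
)
print(preds[0])
```

Passing the components as callables keeps the loop independent of any particular estimator, so a Kalman filter can replace the least-squares stand-in without changing the control flow.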

For convenient reference, the notation used in the real-time OD estimation and 
prediction model is first presented, as follows: 

$i$ = index for links with traffic measurements, $i = 1, \ldots, N_{obs}$ 

$j$ = index for origin-destination pairs, $j = 1, \ldots, N_{od}$ 

$\tau$ = index for aggregated departure time intervals, $\tau = 1, 2, \ldots$ 

$t$ = index for observation time interval, i.e. sampling time interval, $t = 1, 2, \ldots$ 

$k$ = index for stage period, $k = 1, 2, 3, \ldots$ 

$n$ = number of observation intervals per departure time interval 

$l$ = number of departure time intervals per roll period 

$h$ = prediction horizon in numbers of departure time intervals 

$q$ = maximum lag length in numbers of departure time intervals, i.e. the traffic 
flow at the current departure time interval $\tau$ can include traffic demand departing 
from intervals $\tau, \tau-1, \tau-2, \ldots, \tau-q$ 

$C_{(i,t)}$ = number of vehicles measured on link $i$ during observation interval $t$ 

$D_{(j,\tau)}$ = demand volume for origin-destination pair $j$ during departure time 
interval $\tau$ 

$LP_{(i,t),(j,\tau)}$ = link proportions, that is the proportion of vehicles on link $i$ at 
observation time $t$ (coming from OD pair $j$ at departure time $\tau$) to the total demand 
of OD pair $j$ at departure time $\tau$ 

$\tilde{D}_{(j,\tau)}$ = demand volume in regular demand pattern for origin-destination pair $j$ 
during departure time interval $\tau$ 

$\bar{D}_{(j,\tau)}$ = a priori estimate of regular demand volume for origin-destination pair 
$j$ during departure time interval $\tau$ 

$\mu_{(j,\tau)}$ = structural demand deviation from the a priori estimate 
$\bar{D}_{(j,\tau)}$ for OD pair $j$ with departure time $\tau$ 

$\varepsilon_{(j,\tau)}$ = error term in approximating true demand for OD pair $j$ with departure 
time $\tau$ 

$\mu'_{(j,\tau)}, \mu''_{(j,\tau)}, \mu'''_{(j,\tau)}$ = first, second and third-order derivatives of demand 
deviation $\mu_{(j,\tau)}$, respectively 

$p$ = order index of a polynomial model 

$\mu^{(p)}_{(j,\tau)}$ = $p$th-order derivative of demand deviation $\mu_{(j,\tau)}$ 

$m$ = maximum order of a polynomial model 

$\eta^{(p)}_{(j,\tau)}$ = evolution noise for $p$th-order derivative of demand deviation $\mu_{(j,\tau)}$ 

$u_{(i,t)}$ = combined error term in the estimation of link observation $C_{(i,t)}$ due to 
inconsistencies in assumptions about traffic assignment, traffic control and flow 
propagation, as well as measurement noise 

$v_{(i,t)}$ = combined error term due to $u_{(i,t)}$ and $\varepsilon_{(j,\tau)}$ for link observation $C_{(i,t)}$ 

$\hat{D}_{(j,\tau)}$ = estimated mean value of $D_{(j,\tau)}$ 

$\hat{\mu}^{(p)}_{(j,\tau)}$ = estimated mean value of $\mu^{(p)}_{(j,\tau)}$ 

$Z_k$ = state variable vector at stage $k$ 

$Y_k$ = measurement vector at stage $k$ 

$H_k$ = measurement matrix, relating measurement $Y_k$ and state $Z_k$ 

$w_k$ = process noise at stage $k$ 

$v_k$ = measurement noise at stage $k$ 

$\hat{Z}_{k|k-1}$ = prediction of $Z_k$ using observations up to stage $k-1$, 
i.e., $E(Z_k \mid Y_1, Y_2, \ldots, Y_{k-1})$ 

$\hat{Z}_{k|k}$ = estimation of $Z_k$ using observations up to stage $k$, 
i.e., $E(Z_k \mid Y_1, Y_2, \ldots, Y_k)$ 

$P_{k|k-1}$ = predicted state covariance matrix of $Z_k$ at stage $k-1$, 
i.e., $\mathrm{Var}(Z_k - \hat{Z}_{k|k-1})$ 

$P_{k|k}$ = estimated state covariance matrix of $Z_k$ at stage $k$, i.e., $\mathrm{Var}(Z_k - \hat{Z}_{k|k})$ 


Transition Equation. 


The objective of the dynamic OD demand estimation and prediction problem is
to find the time-dependent demand $D_{(j,\tau)}$ for origin-destination pair $j$ at departure
time interval $\tau$. As discussed previously, the true demand $D_{(j,\tau)}$ can be partitioned
into three components, namely, the regular pattern, structural deviations and random
fluctuations. Theoretically, only the a priori estimate $\tilde{D}^{r}_{(j,\tau)}$ of the regular demand,
reflecting prior survey data and surveillance information up to the previous day, is
available before performing real-time estimation on the current day. For this reason,
the true demand $D_{(j,\tau)}$ in the following study is modeled as a linear combination of
the a priori estimate, structural deviation and random disturbance:

$$D_{(j,\tau)} = \tilde{D}^{r}_{(j,\tau)} + \mu_{(j,\tau)} + \varepsilon_{(j,\tau)} \tag{16.1}$$

where the random disturbance term is assumed to follow a Normal distribution with
zero mean. Moreover, a polynomial trend model is introduced to describe the
structural deviations based on the following assumption:


Assumption 1. (Polynomial trend) Deviation at time $\tau + \varsigma$ can be adequately
represented locally by an $m$th-order polynomial function as Equation (16.2) near
time $\tau$ for a small value of $\varsigma$, while derivatives of higher orders are assumed to be
zero: $\mu^{(p)}_{(j,\tau)} = 0$ for $p > m$.

$$\mu_{(j,\tau+\varsigma)} = b_0 + b_1 \varsigma + b_2 \varsigma^2 + \cdots + b_p \varsigma^p + \cdots + b_m \varsigma^m \tag{16.2}$$


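From Taylor's theorem, the smooth function $\mu_{(j,\tau+\varsigma)}$ can be expanded about the
point $\mu_{(j,\tau)}$ as

$$\mu_{(j,\tau+\varsigma)} = \mu_{(j,\tau)} + \varsigma\,\mu'_{(j,\tau)} + \frac{\varsigma^2}{2!}\,\mu''_{(j,\tau)} + \cdots + \frac{\varsigma^p}{p!}\,\mu^{(p)}_{(j,\tau)} + \cdots + \frac{\varsigma^m}{m!}\,\mu^{(m)}_{(j,\tau)} \tag{16.3}$$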


A comparison of Equations (16.2) and (16.3) indicates that the polynomial
coefficients in the original functional form can be obtained directly from

$$b_p = \frac{1}{p!}\,\mu^{(p)}_{(j,\tau)}, \quad p = 0, 1, \ldots, m. \tag{16.4}$$


A more compact form for the $p$th-order derivative of a polynomial can be generalized
as

$$\mu^{(p)}_{(j,\tau+\varsigma)} = \sum_{s=p}^{m} \frac{\varsigma^{(s-p)}}{(s-p)!}\,\mu^{(s)}_{(j,\tau)} \tag{16.5}$$


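The corresponding matrix representation for a third-order polynomial model can be
expressed as

$$\begin{bmatrix} \mu_{(j,\tau+\varsigma)} \\ \mu'_{(j,\tau+\varsigma)} \\ \mu''_{(j,\tau+\varsigma)} \\ \mu'''_{(j,\tau+\varsigma)} \end{bmatrix} = \begin{bmatrix} 1 & \varsigma & \varsigma^2/2 & \varsigma^3/6 \\ 0 & 1 & \varsigma & \varsigma^2/2 \\ 0 & 0 & 1 & \varsigma \\ 0 & 0 & 0 & 1 \end{bmatrix} \begin{bmatrix} \mu_{(j,\tau)} \\ \mu'_{(j,\tau)} \\ \mu''_{(j,\tau)} \\ \mu'''_{(j,\tau)} \end{bmatrix} \tag{16.6}$$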





The next assumption is required to allow time-varying trends to evolve stochastically 
between time stages. 


Assumption 2. (Evolution process) From stage $k$ to stage $k + 1$, the change of
derivative $\mu^{(p)}_{(j,\tau)}$ can be described as

$$\mu^{(p)}_{(j,\tau+l)} = \sum_{s=p}^{m} \frac{l^{(s-p)}}{(s-p)!}\,\mu^{(s)}_{(j,\tau)} + w^{(p)}_{(j,\tau)} \tag{16.7}$$

where departure time index $\tau = kl$, and $w^{(p)}_{(j,\tau)} \sim N[0, W^{(p)}_{(j,\tau)}]$.

Taking a third-order polynomial trend model for OD pair $j$ as an example, the
corresponding transition equation in the Kalman filtering formulation is given in
(16.8). More precisely, the state vector consists of the zeroth- to $m$th-order derivatives
of demand structural deviations from the a priori regular demand pattern estimate for
OD pair $j$. Note that the transition matrix is independent of the current stage $k$ and
related departure time interval $\tau$.


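$$\begin{bmatrix} \mu_{(j,\tau+l)} \\ \mu'_{(j,\tau+l)} \\ \mu''_{(j,\tau+l)} \\ \mu'''_{(j,\tau+l)} \end{bmatrix} = \begin{bmatrix} 1 & l & l^2/2 & l^3/6 \\ 0 & 1 & l & l^2/2 \\ 0 & 0 & 1 & l \\ 0 & 0 & 0 & 1 \end{bmatrix} \begin{bmatrix} \mu_{(j,\tau)} \\ \mu'_{(j,\tau)} \\ \mu''_{(j,\tau)} \\ \mu'''_{(j,\tau)} \end{bmatrix} + \begin{bmatrix} w_{(j,\tau)} \\ w'_{(j,\tau)} \\ w''_{(j,\tau)} \\ w'''_{(j,\tau)} \end{bmatrix} \tag{16.8}$$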
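
The structure of this transition matrix can be illustrated with a minimal Python sketch. It builds the per-OD-pair block whose entries are $l^{(s-p)}/(s-p)!$ and propagates one derivative vector without the noise term. The function names and the numerical values are illustrative assumptions and do not come from the original study.

```python
from math import factorial


def transition_block(m, step):
    """Upper-triangular (m+1) x (m+1) block with entries
    step**(s-p) / (s-p)! for s >= p and 0 otherwise."""
    return [[step ** (s - p) / factorial(s - p) if s >= p else 0.0
             for s in range(m + 1)]
            for p in range(m + 1)]


def propagate(block, state):
    """Noise-free one-stage propagation of the derivative vector."""
    return [sum(a * z for a, z in zip(row, state)) for row in block]


# Third-order model with l = 3 departure intervals per roll period (assumed values).
A = transition_block(3, 3)
state = [10.0, 2.0, 0.5, 0.0]       # deviation and its first three derivatives
next_state = propagate(A, state)    # deviation and derivatives one roll period later
print(A[0])
print(next_state)
```

With these numbers the deviation follows $10 + 2\varsigma + 0.25\varsigma^2$, so the propagated level and slope at $\varsigma = 3$ can be checked by hand.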





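Consequently, the single OD-pair model above can be easily extended to consider all
the OD pairs in a network. Considering a third-order polynomial filter with departure
time $\tau = kl$ at stage $k$, we can define the state vector as

$$Z_k = \left(\mu_{(1,\tau)}, \mu'_{(1,\tau)}, \mu''_{(1,\tau)}, \mu'''_{(1,\tau)}, \mu_{(2,\tau)}, \mu'_{(2,\tau)}, \mu''_{(2,\tau)}, \mu'''_{(2,\tau)}, \ldots, \mu_{(N_{od},\tau)}, \mu'_{(N_{od},\tau)}, \mu''_{(N_{od},\tau)}, \mu'''_{(N_{od},\tau)}\right)^{T} \tag{16.9}$$

and the transition matrix as

$$A_k = \mathrm{Diag}\left(A^{1}, A^{2}, \ldots, A^{j}, \ldots, A^{N_{od}}\right) \tag{16.10}$$

where

$$A^{j} = \begin{bmatrix} 1 & l & l^2/2 & l^3/6 \\ 0 & 1 & l & l^2/2 \\ 0 & 0 & 1 & l \\ 0 & 0 & 0 & 1 \end{bmatrix} \tag{16.11}$$

for $j = 1, 2, \ldots, N_{od}$.
By assuming the evolution noise $w_k$ as

$$w_k = \left(w_{(1,\tau)}, w'_{(1,\tau)}, w''_{(1,\tau)}, w'''_{(1,\tau)}, w_{(2,\tau)}, w'_{(2,\tau)}, w''_{(2,\tau)}, w'''_{(2,\tau)}, \ldots, w_{(N_{od},\tau)}, w'_{(N_{od},\tau)}, w''_{(N_{od},\tau)}, w'''_{(N_{od},\tau)}\right)^{T} \tag{16.12}$$

the complete transition equation in real-time OD estimation and prediction can be
written as

$$Z_{k+1} = A_k Z_k + w_k. \tag{16.13}$$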

To obtain the future demand level with prediction horizon $h$, we need to first predict
the demand deviation at time $\tau + h$ based on estimated derivatives at the current
stage, and then substitute the predicted demand deviation and the a priori estimate
of the regular demand pattern into Equation (16.1). Thus,

$$E\left[D_{(j,\tau+h)} \mid \mu_{(j,\tau)}\right] = \tilde{D}^{r}_{(j,\tau+h)} + E\left[\mu_{(j,\tau+h)} \mid \mu_{(j,\tau)}\right] = \tilde{D}^{r}_{(j,\tau+h)} + \sum_{s=0}^{m} \frac{h^{s}}{s!}\,\hat{\mu}^{(s)}_{(j,\tau)} \tag{16.14}$$

where $\tau = kl$.

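By incorporating the a priori estimate of the regular demand pattern in the
proposed structural model, one computational advantage is the reduction in the
dimension of the state variable vector. To show this, suppose the original demand can be
adequately fitted by an $s$th-order polynomial model as

$$D_{(j,\tau+\varsigma)} = a_0 + a_1 \varsigma + a_2 \varsigma^2 + \cdots + a_m \varsigma^m + \cdots + a_s \varsigma^s + \varepsilon_{(j,\tau+\varsigma)} \tag{16.15}$$

If $\tilde{D}^{r}_{(j,\tau+\varsigma)}$ is a good approximation to $D_{(j,\tau+\varsigma)}$, it can be further assumed that
$\tilde{D}^{r}_{(j,\tau+\varsigma)}$ also corresponds to an $s$th-order polynomial model satisfying $(a_p - \tilde{a}_p) = 0$
for $p > m$, as

$$\tilde{D}^{r}_{(j,\tau+\varsigma)} = \tilde{a}_0 + \tilde{a}_1 \varsigma + \tilde{a}_2 \varsigma^2 + \cdots + \tilde{a}_m \varsigma^m + \cdots + \tilde{a}_s \varsigma^s. \tag{16.16}$$

Then, ignoring the higher-order terms from $m+1$ to $s$, we have

$$\begin{aligned} \mu_{(j,\tau+\varsigma)} &= D_{(j,\tau+\varsigma)} - \tilde{D}^{r}_{(j,\tau+\varsigma)} - \varepsilon_{(j,\tau+\varsigma)} \\ &= (a_0 - \tilde{a}_0) + (a_1 - \tilde{a}_1)\varsigma + \cdots + (a_m - \tilde{a}_m)\varsigma^m + \cdots + (a_s - \tilde{a}_s)\varsigma^s \\ &= (a_0 - \tilde{a}_0) + (a_1 - \tilde{a}_1)\varsigma + \cdots + (a_m - \tilde{a}_m)\varsigma^m \end{aligned} \tag{16.17}$$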
and the resulting order of the polynomial model will be reduced from $s$ to $m$. Since
the computational complexity of the Kalman filter is on the order of $O(N^3)$, where
$N = N_{od} \times m$ in our case, the reduction of the model order from $s$ to $m$ dramatically
decreases the size of the state vector, and therefore improves computational
efficiency.

In addition, incorporating a reliable estimate of the regular demand pattern is 
always beneficial for improving estimation and prediction quality. From the linear 
regression standpoint, the regular daily pattern can be viewed as a good explanatory
regressor that absorbs a considerable amount of variation in the dependent
variable (i.e. true dynamic demand). Thus, compared to a pure polynomial model,
the proposed structural model with the regular pattern component leads to smaller 
regression residual errors, that is, smaller estimation and prediction errors. 
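
Returning to the prediction step in Equation (16.14), the following Python sketch shows how a demand forecast at horizon $h$ could be assembled from the a priori regular demand and the estimated derivatives of the deviation. The function name and the numbers are hypothetical and serve only to illustrate the arithmetic.

```python
from math import factorial


def predict_demand(prior_regular, deriv_estimates, h):
    """Forecast demand h departure intervals ahead.

    prior_regular   -- a priori regular demand at the target interval (assumed known)
    deriv_estimates -- [mu_hat, mu_hat', mu_hat'', ...] at the current stage
    h               -- prediction horizon in departure time intervals
    """
    deviation = sum(h ** s / factorial(s) * d
                    for s, d in enumerate(deriv_estimates))
    return prior_regular + deviation


# Second-order trend, horizon of two departure intervals (illustrative values).
forecast = predict_demand(120.0, [10.0, 2.0, 0.5], 2)
print(forecast)
```

Here the predicted deviation is $10 + 2 \cdot 2 + 0.5 \cdot 2^2 / 2 = 15$, which is added to the a priori value of 120.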


Measurement Equations. In general, the measurement equation connects the link
observations and OD demands through a link proportion matrix, as in Equation
(16.18). Specifically, the link proportions map all the lagged and prevailing demands
at the current stage $k$ to $n \times l$ measurements for each link with available observations.

$$c_{(i,t)} = \sum_{\varsigma=-q}^{l-1} \sum_{j=1}^{N_{od}} \left(LP_{(i,t),(j,\tau+\varsigma)} \times D_{(j,\tau+\varsigma)}\right) + u_{(i,t)} \tag{16.18}$$

where $\tau = kl$, $i = 1, 2, \ldots, N_{obs}$ and $knl \le t \le (k+1)nl - 1$.
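To relate link measurements to the state variables constructed previously, we
substitute Equations (16.1) and (16.3) into the equation above, which becomes

$$c_{(i,t)} = \sum_{\varsigma=-q}^{l-1} \sum_{j=1}^{N_{od}} \left[ LP_{(i,t),(j,\tau+\varsigma)} \times \left( \sum_{s=0}^{m} \frac{\varsigma^{s}}{s!}\,\mu^{(s)}_{(j,\tau)} + \tilde{D}^{r}_{(j,\tau+\varsigma)} + \varepsilon_{(j,\tau+\varsigma)} \right) \right] + u_{(i,t)} \tag{16.19}$$

and it can be further transformed to

$$\begin{aligned} c_{(i,t)} = {} & \sum_{\varsigma=-q}^{l-1} \sum_{j=1}^{N_{od}} \left( LP_{(i,t),(j,\tau+\varsigma)} \times \tilde{D}^{r}_{(j,\tau+\varsigma)} \right) \\ & + \sum_{\varsigma=-q}^{l-1} \sum_{j=1}^{N_{od}} \left( LP_{(i,t),(j,\tau+\varsigma)} \times \sum_{s=0}^{m} \frac{\varsigma^{s}}{s!}\,\mu^{(s)}_{(j,\tau)} \right) \\ & + \sum_{\varsigma=-q}^{l-1} \sum_{j=1}^{N_{od}} \left( LP_{(i,t),(j,\tau+\varsigma)} \times \varepsilon_{(j,\tau+\varsigma)} \right) + u_{(i,t)} \end{aligned} \tag{16.20}$$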


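Consequently, we can define the observation vector and measurement error in the
Kalman formulation as follows:

$$Y_k = H_k Z_k + v_k \tag{16.21}$$

$$Y_k = \left(y_{(1,knl)}, y_{(1,knl+1)}, \ldots, y_{(1,(k+1)nl-1)}, \ldots, y_{(N_{obs},knl)}, y_{(N_{obs},knl+1)}, \ldots, y_{(N_{obs},(k+1)nl-1)}\right)^{T} \tag{16.22}$$

where

$$y_{(i,t)} = c_{(i,t)} - \sum_{\varsigma=-q}^{l-1} \sum_{j=1}^{N_{od}} \left( LP_{(i,t),(j,\tau+\varsigma)} \times \tilde{D}^{r}_{(j,\tau+\varsigma)} \right) \tag{16.23}$$

$$v_k = \left(v_{(1,knl)}, v_{(1,knl+1)}, \ldots, v_{(1,(k+1)nl-1)}, \ldots, v_{(N_{obs},knl)}, v_{(N_{obs},knl+1)}, \ldots, v_{(N_{obs},(k+1)nl-1)}\right)^{T} \tag{16.24}$$

where

$$v_{(i,t)} = \sum_{\varsigma=-q}^{l-1} \sum_{j=1}^{N_{od}} \left( LP_{(i,t),(j,\tau+\varsigma)} \times \varepsilon_{(j,\tau+\varsigma)} \right) + u_{(i,t)}. \tag{16.25}$$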


The final measurement error term in the measurement equation combines the random
noise in OD demand, other errors associated with link proportions, as well as the 
sensor errors in traffic measurements. 

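The dimension of measurement matrix $H_k$ is $(N_{obs} \times nl,\; N_{od} \times m)$, and its
$[(i,t),(j,p)]$th element is

$$H_{(i,t),(j,p)} = \sum_{\varsigma=-q}^{l-1} \left( LP_{(i,t),(j,\tau+\varsigma)} \times \frac{\varsigma^{p}}{p!} \right) \tag{16.26}$$

where $\tau = kl$.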
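
As a hedged illustration of how one row of the measurement matrix could be assembled for a single OD pair, the Python sketch below sums link proportions weighted by $\varsigma^{p}/p!$ over the offsets $\varsigma = -q, \ldots, l-1$. The function name, the list layout of the link proportions and the numbers are assumptions made for this example.

```python
from math import factorial


def measurement_row(link_props, q, m):
    """Entries H_(i,t),(j,0..m) for one observation (i,t) and one OD pair j.

    link_props[idx] is the link proportion for offset (idx - q),
    so the list covers offsets -q, ..., l-1.
    """
    return [sum(lp * (idx - q) ** p / factorial(p)
                for idx, lp in enumerate(link_props))
            for p in range(m + 1)]


# One lagged interval (q = 1), two intervals per roll period (l = 2), first-order model.
link_props = [0.2, 0.5, 0.1]            # offsets -1, 0, +1 (illustrative)
row = measurement_row(link_props, q=1, m=1)

# Cross-check against a direct sum over the lagged and prevailing deviations.
state = [10.0, 2.0]                     # deviation level and slope at tau
direct = sum(lp * (state[0] + state[1] * (idx - 1))
             for idx, lp in enumerate(link_props))
via_row = sum(h * z for h, z in zip(row, state))
print(row, direct, via_row)
```

The cross-check shows why the filter needs only the derivatives at $\tau$: the lagged deviations are implied by the polynomial, so the row applied to the state reproduces the link-proportion-weighted sum.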

By applying a polynomial approximation for OD demands during departure time
intervals from $kl - q$ to $(k+1)l - 1$, the polynomial trend filter neatly incorporates the
lagged demands into the estimation procedure for the current stage, leading to an
efficient state space representation, desirable for large-scale network applications.

Assumption 3. $w_k$ and $v_k$ are white noise terms uncorrelated with the initial state
$Z_0$ and with each other, where $w_k \sim N[0, W_k]$ and $v_k \sim N[0, V_k]$.

From Assumption 3, the Kalman filtering algorithm is ready to be integrated into 
the following recursive estimation and prediction algorithm. 

Algorithm 2. Real-time dynamic demand OD estimation and prediction
Step 1: (Initialization) Set up initial estimates $P_{0,0} = \mathrm{Var}(Z_0)$ and $\hat{Z}_{0,0} = E(Z_0)$.
Let $k = 1$.
Step 2: (Prediction) Propagate the mean and covariance estimates from $k - 1$ to $k$.

$$\hat{Z}_{k,k-1} = A_k \hat{Z}_{k-1,k-1} \tag{16.27}$$

$$P_{k,k-1} = A_k P_{k-1,k-1} A_k^{T} + W_k \tag{16.28}$$

Step 3: (Estimation of state variable) After receiving new link proportions and link
observations, calculate the weighting matrix as

$$K_k = P_{k,k-1} H_k^{T} \left( H_k P_{k,k-1} H_k^{T} + V_k \right)^{-1}, \tag{16.29}$$

and then update the a posteriori mean and covariance estimates.

$$\hat{Z}_{k,k} = \hat{Z}_{k,k-1} + K_k \left( Y_k - H_k \hat{Z}_{k,k-1} \right) \tag{16.30}$$

$$P_{k,k} = (I - K_k H_k) P_{k,k-1} \tag{16.31}$$

Step 4: (Estimation of real-time demand) Calculate the estimation of real-time
demand using new estimates $\hat{\mu}_{(j,\tau)}$.

$$\hat{D}_{(j,\tau)} = E\left( \tilde{D}^{r}_{(j,\tau)} + \mu_{(j,\tau)} + \varepsilon_{(j,\tau)} \right) = \tilde{D}^{r}_{(j,\tau)} + \hat{\mu}_{(j,\tau)} \tag{16.32}$$

where $\tau = kl, kl+1, \ldots, (k+1)l - 1$.
Step 5: Advance roll period forward from $k$ to $k+1$, and then go back to Step 2.
Further, if independence of measurement errors is assumed, that is,

$$v_k \sim N\left(0, \mathrm{diag}\left[V_{(1,knl)}, V_{(1,knl+1)}, \ldots, V_{(N_{obs},(k+1)nl-1)}\right]\right), \tag{16.33}$$


we can apply the scalar updating scheme described in Ashok [1] in order to avoid 
complicated matrix inversion in a real-time setting. 
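
The prediction and update steps of Algorithm 2 can be sketched in plain Python as follows. When the measurement errors are independent, each scalar observation can be processed in turn, so only a scalar division is needed in place of a matrix inversion. This is the generic textbook sequential update and may differ in detail from the scheme in Ashok [1]; the function names, the two-state local linear trend and all numbers are illustrative assumptions.

```python
def predict(A, z, P, W):
    """Prediction step: z_pred = A z, P_pred = A P A^T + W."""
    n = len(z)
    z_pred = [sum(A[r][c] * z[c] for c in range(n)) for r in range(n)]
    AP = [[sum(A[r][x] * P[x][c] for x in range(n)) for c in range(n)]
          for r in range(n)]
    P_pred = [[sum(AP[r][x] * A[c][x] for x in range(n)) + W[r][c]
               for c in range(n)] for r in range(n)]
    return z_pred, P_pred


def scalar_update(z, P, h, y, var):
    """Update with one scalar measurement y = h.z + v, v ~ N(0, var)."""
    n = len(z)
    Ph = [sum(P[r][c] * h[c] for c in range(n)) for r in range(n)]
    s = sum(h[r] * Ph[r] for r in range(n)) + var      # scalar innovation variance
    gain = [x / s for x in Ph]                         # no matrix inversion needed
    resid = y - sum(h[r] * z[r] for r in range(n))
    z_new = [z[r] + gain[r] * resid for r in range(n)]
    # (I - K h^T) P, using the symmetry of P so that h^T P = (P h)^T
    P_new = [[P[r][c] - gain[r] * Ph[c] for c in range(n)] for r in range(n)]
    return z_new, P_new


A = [[1.0, 1.0], [0.0, 1.0]]            # first-order trend block with l = 1
W = [[1.0, 0.0], [0.0, 1.0]]            # assumed evolution noise covariance
z, P = predict(A, [0.0, 0.0], [[100.0, 0.0], [0.0, 100.0]], W)
P_pred = [row[:] for row in P]

# Each tuple: (row of H_k, measured value, measurement variance), all hypothetical.
measurements = [([1.0, 0.0], 5.0, 4.0), ([0.5, 0.0], 3.0, 4.0)]
for h, y, var in measurements:
    z, P = scalar_update(z, P, h, y, var)
print(z, P)
```

Each pass through the loop tightens the covariance, and the final estimate is the same one a single batch update with a diagonal $V_k$ would produce.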

In the context of short-term economic forecasting, the zeroth-, first- and second-order
polynomial models can be viewed as the local level model, local linear growth
model and local quadratic growth model, respectively. Regarding connections
between the polynomial trend models and other time-series ARIMA models, West and
Harrison [34] demonstrated that, if restrictions are imposed on the autocorrelation
structure, the limiting case of an $(m+1)$th polynomial trend model is equivalent to an
ARIMA(0, m, m) model. According to the generalized state-space architecture
proposed by Harvey [16], an auto-regression (AR) term can also be incorporated into
the state variable vector to model the autocorrelation structure in the random
disturbance. It is worth remarking that, even if the underlying trends for demand structural
deviations are negligible, it is advisable to embed a polynomial trend component in
the state space representation so as to monitor and identify possible changes in the
process structure.

16.4 Adaptive Day-To-Day Updating of Regular Demand Pattern Information


As discussed earlier, the initial estimate for the regular demand pattern could 
be unreliable due to limited sample size, and the normal daily pattern could evolve 
smoothly due to day-to-day demand dynamics. Hence, it is necessary to update the a 
priori estimate using the new demand estimate and new observations. A desirable up- 
dating formulation should be able to adaptively recognize and capture the systematic 
day-to-day evolution, and also maintain robustness under disruptions due to special 
events. An updating formulation based on a Kalman filter framework is proposed. 
The notation used in the real-time OD estimation and prediction model is ex- 
tended to the day-to-day context as follows: 
$d$ = index for day

$D^{r}_{d}$ = state variable vector of regular OD demand pattern on day $d$, consisting of
elements $D^{r}_{(j,\tau)}$

$\xi_d$ = day-to-day evolution noise on day $d$, with variance matrix $Q_d$

$\hat{D}_d$ = vector of the real-time demand estimate on day $d$, consisting of estimates
$\hat{D}_{(j,\tau)}$

$\eta_d$ = measurement noise on day $d$, with variance matrix $R_d$

$\hat{D}^{r}_{d,d-1}$ = predicted state variable vector $D^{r}_{d}$ using observations up to day $d-1$,
consisting of elements $\tilde{D}^{r}_{(j,\tau)}$

$\hat{D}^{r}_{d,d}$ = estimated state variable vector $D^{r}_{d}$ using observations up to day $d$

$\Sigma_{d,d-1}$ = predicted state covariance matrix for the regular demand pattern on
day $d$

$\Sigma_{d,d}$ = estimated state covariance matrix for the regular demand pattern on day
$d$

$K_d$ = Kalman gain matrix for using real-time demand estimates on day $d$

$\hat{M}_d$ = vector of estimated demand deviations on day $d$, with elements $\hat{\mu}_{(j,\tau)}$

$K^{c}_{d}$ = Kalman gain matrix for using real-time observations on day $d$

$C_d$ = vector of traffic observations on day $d$, consisting of elements $c_{(i,t)}$

$LP_d$ = link proportion matrix on day $d$, consisting of elements $LP_{(i,t),(j,\tau)}$

The transition and measurement equations for the day-to-day demand evolution 
can be written as 

Transition Equation: $$D^{r}_{d+1} = D^{r}_{d} + \xi_d. \tag{16.34}$$

Measurement Equation: $$\hat{D}_d = D^{r}_{d} + \eta_d. \tag{16.35}$$

Assumption 4. $\xi_d$ and $\eta_d$ are white noise terms uncorrelated with the initial state
$D^{r}_{0}$ and with each other, where $\xi_d \sim N(0, Q_d)$ and $\eta_d \sim N(0, R_d)$.

According to transition equation (16.34), the regular demand pattern can evolve
smoothly from day to day, where stochastic day-to-day evolution is captured by the
evolution random term $\xi_d$ with zero mean. In the measurement equation, since the
true demand state cannot be directly observed, the new real-time demand estimate
$\hat{D}_d$ is considered as a "measurement" incoming every day. Following the standard
Kalman filtering algorithm, the updating procedure can be summarized as follows.

Algorithm 3. Day-to-day updating for regular demand pattern estimate

Step 1: (Initialization) Set up $\hat{D}^{r}_{0,0}$ and $\Sigma_{0,0}$ as the initial estimated mean and
covariance of the regular demand. Let $d = 1$.

Step 2: (Computation of a priori estimate) The a posteriori estimate $\hat{D}^{r}_{d-1,d-1}$ on
previous day $d-1$ is used as the a priori estimate for current day $d$. The corresponding
covariance matrix is updated by taking evolution noise into account:

$$\hat{D}^{r}_{d,d-1} = \hat{D}^{r}_{d-1,d-1} \tag{16.36}$$

$$\Sigma_{d,d-1} = \Sigma_{d-1,d-1} + Q_d. \tag{16.37}$$

Step 3: (Real-time OD estimation and prediction) Run the real-time OD estimation
and prediction module in conjunction with real-time DTA simulators, to obtain new
estimates $\hat{M}_d$ and $\hat{D}_d$ for day $d$.

Step 4: (Update of gain matrix) Compute the gain matrix using the predicted state
covariance matrix and measurement variance matrix:

$$K_d = \Sigma_{d,d-1} \left( \Sigma_{d,d-1} + R_d \right)^{-1}. \tag{16.38}$$

Step 5: (Update of mean and covariance) Update the estimated mean and covariance
matrix for the regular demand state vector:

$$\hat{D}^{r}_{d,d} = \hat{D}^{r}_{d,d-1} + K_d \left( \hat{D}_d - \hat{D}^{r}_{d,d-1} \right) = \hat{D}^{r}_{d,d-1} + K_d \hat{M}_d, \tag{16.39}$$

$$\Sigma_{d,d} = (I - K_d)\, \Sigma_{d,d-1}. \tag{16.40}$$

Step 6: Move to the next day, $d = d+1$, and then go back to Step 2.

An important point is that $\hat{D}_d - \hat{D}^{r}_{d,d-1}$ in the day-to-day updating algorithm
above is equivalent to $\hat{M}_d$, which consists of the demand deviation estimates $\hat{\mu}_{(j,\tau)}$
generated from the real-time estimation and prediction algorithm on day $d$. It can be
shown that $\Sigma_{d,d} < \Sigma_{d,d-1}$, that is, the conditional demand estimate contains less
uncertainty than the corresponding a priori estimate for the regular demand pattern, 
after incorporating additional information from the new real-time estimation result. 
More importantly, the recursive updating algorithm above naturally integrates with 
the proposed structural model in the previous section, and it is able to accumulate 
the information from the real-time estimator on a daily basis. 
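
For a single OD pair and departure interval, the day-to-day updating steps reduce to scalar arithmetic. The Python sketch below assumes diagonal covariance matrices so that each element can be updated on its own; that simplification, the function name and the numbers are assumptions made for illustration.

```python
def day_to_day_update(prior_mean, prior_var, realtime_est, q_var, r_var):
    """One day of the regular-demand update for a single (OD pair, interval).

    prior_mean, prior_var -- a posteriori mean and variance from the previous day
    realtime_est          -- real-time demand estimate obtained on the current day
    q_var, r_var          -- evolution and measurement variances (assumed scalars)
    """
    pred_var = prior_var + q_var                 # covariance prediction
    gain = pred_var / (pred_var + r_var)         # gain factor
    deviation = realtime_est - prior_mean        # estimated demand deviation
    post_mean = prior_mean + gain * deviation    # mean update
    post_var = (1.0 - gain) * pred_var           # covariance update
    return post_mean, post_var, gain


post_mean, post_var, gain = day_to_day_update(
    prior_mean=100.0, prior_var=25.0, realtime_est=120.0, q_var=5.0, r_var=30.0)
print(post_mean, post_var, gain)
```

In this example the predicted variance equals the measurement variance, so the gain is 0.5 and the updated mean lies midway between the a priori value and the real-time estimate, with a posterior variance below the predicted one.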

In order to make this recursive algorithm operational, the next question is how to
specify the values of evolution variance $Q_d$ and measurement variance $R_d$. By using
the multi-day OD estimation method proposed by Zhou et al. [35], the variance of the
measurement noise in Equation (16.35) can be obtained by evaluating the variance
of estimated OD demands across several days. Determining the day-to-day variance
is generally more difficult, since we cannot directly observe the day-to-day demand
evolution process. Recognizing that the proposed day-to-day evolution process can
be described as a random walk plus noise model, existing time series techniques are
applied to choose appropriate values of process variance. A common approach is to
first assume a constant signal-to-noise ratio $\lambda = Q_d / R_d$, indicating the ratio of
inherent system variance with respect to observational variance. Based on calibrated $R_d$,
$Q_d = \lambda R_d$, so we can select an appropriate signal-to-noise ratio so as to minimize
average prediction errors in the training data sets. The reader is referred to West and
Harrison [34] for a comprehensive treatment. In early stages of applying this
updating mechanism, considerable uncertainty in the predicted state covariance $\Sigma_{d,d-1}$
results in a high gain factor, implying that the new real-time estimates receive
relatively large weighting. After a certain number of iterations, the gain factor becomes
stable as it gradually reaches a steady state. If constant $Q$ and $R$ are assumed, the
following limiting behavior for the Kalman gain matrix can be derived [34]:

$$\lim_{d \to \infty} K_d = \frac{\lambda}{2} \left( \sqrt{1 + \frac{4}{\lambda}} - 1 \right) \tag{16.41}$$

A typical value of $\lambda$ can be 0.05, leading to $\lim_{d \to \infty} K_d = 0.2$, so the most recent
real-time estimate receives relatively small weighting eventually. If $\lambda = 0.5$, corresponding
to a limiting gain factor of 0.5, then the a priori estimate and the new real-time
estimate share equal importance in determining a priori demand information for the
next day.
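
The limiting gain can be checked numerically. The Python sketch below compares the closed-form limit with the value reached by iterating the scalar gain and variance recursions under constant variances. The function names, the initial variance and the number of iterations are arbitrary assumptions.

```python
from math import sqrt


def limiting_gain(lam):
    """Closed-form limit of the gain for a constant signal-to-noise ratio lam."""
    return 0.5 * lam * (sqrt(1.0 + 4.0 / lam) - 1.0)


def iterate_gain(lam, r_var=1.0, init_var=100.0, days=200):
    """Run the scalar day-to-day recursion and return the last gain."""
    q_var = lam * r_var
    var = init_var
    gain = None
    for _ in range(days):
        pred = var + q_var               # predicted variance
        gain = pred / (pred + r_var)     # gain factor
        var = (1.0 - gain) * pred        # updated variance
    return gain


for lam in (0.05, 0.5):
    print(lam, limiting_gain(lam), iterate_gain(lam))
```

Starting from a large initial variance, the first gains are close to one and then decay toward the limit, which matches the early-stage behavior described above.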

Another approach to update the historical demand estimate is to directly utilize 
link observations instead of real-time demand estimates. The modified transition and 
measurement equations are given in Equations (16.42) and (16.43): 


Transition Equation:
$$D^{r}_{d+1} = D^{r}_{d} + \xi_d. \tag{16.42}$$
Measurement Equation:
$$C_d = LP_d D^{r}_{d} + \eta'_d. \tag{16.43}$$
Assumption 5. $\xi_d$ and $\eta'_d$ are white noise terms uncorrelated with the initial state $D^{r}_{0}$
and with each other, where $\xi_d \sim N(0, Q_d)$ and $\eta'_d \sim N(0, R'_d)$.

Note that the transition equation above is identical to the one using real-time
estimates. As the measurement equation uses link proportions to link real-world
measurements and the regular demand pattern, the measurement variance $R'_d$ should be
recalibrated accordingly. The gain matrix, mean and covariance updating
formulation in Steps 4 and 5 of Algorithm 3 should also be changed to the following:

$$K^{c}_{d} = \Sigma_{d,d-1} LP_d^{T} \left( LP_d \Sigma_{d,d-1} LP_d^{T} + R'_d \right)^{-1}, \tag{16.44}$$
$$\hat{D}^{r}_{d,d} = \hat{D}^{r}_{d,d-1} + K^{c}_{d} \left( C_d - LP_d \hat{D}^{r}_{d,d-1} \right), \tag{16.45}$$
$$\Sigma_{d,d} = \left( I - K^{c}_{d} LP_d \right) \Sigma_{d,d-1}. \tag{16.46}$$


This Kalman filtering formulation provides a least-squares unbiased estimator for the 
regular demand pattern, with the optimal weights on the a priori estimate and new 
information. It is important to note that this recursive prediction-correction algorithm 
only requires a priori mean and covariance statistics at each iteration instead of the 
entire historical data series, resulting in efficient storage implementation for on-line 
applications. Practically, this updating method can be viewed as a moving average 
method with adaptive weights, depending on the respective reliability of the a priori 
and real-time information sources.

16.5 Application to Irvine Network Using PeMS Data


Numerical experiments are presented to illustrate the application of the proposed 
model and algorithms to the Irvine, CA test bed traffic network, which consists of 
three freeway corridors (I-5, I-405, Highway 133) and other main arterials. As shown 
in Figure 16.2, this network includes 61 OD zones, 326 nodes and 626 links, where 
traffic counts are measured at 30-second intervals on 19 freeway links and at 5- 
minute intervals on 28 arterial links. In addition, the a priori estimate of the regular 
demand pattern is constructed by the off-line OD estimation method [30],[35] using 
the first day data. Real-world observations on the second day are used to calibrate 
the system and measurement variances in the real-time OD estimation and prediction 
model. The third day data are used to validate the proposed real-time OD estimation 
and prediction algorithm. The time of interest in the following experiments is the 
morning peak period (4:00 AM — 10:00 AM), while the demand departure time in- 
terval and roll period are 5 and 15 minutes, respectively. 

First, a first-order polynomial trend model (i.e. local linear trend model) is ap- 
plied to estimate the demand deviations from the a priori estimate of the regular 
demand pattern. In Figure 16.3, the a priori regular pattern estimate and the corre- 
sponding demand deviations are displayed for the OD pair from zone 53 to zone 40, 
which carries the most trips among all the OD pairs in the study network.
On average, the a priori demand data underestimates the real-time demand on the
third day for this selected OD pair, but the prior information still shows similar
time-varying dynamic patterns. As expected, the demand deviations exhibit a much slower
changing pattern than the corresponding real-time demands over the same time.
Essentially, the estimated structural demand deviations are caused by the day-to-day
dynamics, but the deviations shown in this case can also be due to the estimation
noise in the a priori demand data, which only utilizes one-day observations.

Considering the smooth trend for demand deviations in Figure 16.3, it is desirable
to further reduce the first-order polynomial model to the zeroth-order polynomial
model. To assess and compare the estimation performance of alternative models, we
define root mean square (RMS) error in density as





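$$\mathrm{RMS}_t = \sqrt{\frac{1}{N_{obs}} \sum_{i=1}^{N_{obs}} \left( c_{i,t} - \hat{c}_{i,t} \right)^2} \tag{16.47}$$

where $c_{i,t}$ = density measured on link $i$ during observation interval $t$, and $\hat{c}_{i,t}$ =
simulated density from the real-time DTA estimator on link $i$ during observation
interval $t$.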
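
A minimal Python sketch of the error measure follows, assuming that the RMS error at one observation interval averages the squared differences between measured and simulated densities over the observed links. The function name and the density values are hypothetical.

```python
from math import sqrt


def rms_error(observed, simulated):
    """Root mean square difference between two equally long density lists."""
    if not observed or len(observed) != len(simulated):
        raise ValueError("need two non-empty lists of equal length")
    squared = sum((c - s) ** 2 for c, s in zip(observed, simulated))
    return sqrt(squared / len(observed))


# Densities on four observed links at one interval (made-up values).
observed = [30.0, 45.0, 60.0, 20.0]
simulated = [27.0, 49.0, 60.0, 20.0]
print(rms_error(observed, simulated))
```

Averaging this quantity over all intervals in the study horizon would give a single summary figure per model.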

The RMS errors at every 5 minutes are plotted in Figure 16.4, for the zeroth- and
first-order polynomial models, respectively. The average RMS error of the zeroth-order
model during the study horizon (10.2608) is marginally greater than the
average RMS error of the first-order model (10.8588) by 1.8%. Basically, these two
models perform better in the early morning (from 4:00 AM to 6:00 AM), compared
to the peak hour period (from 7:00 AM to 9:00 AM). Such time-dependent RMS
measures can be explained by the increasing variability in the peak hour demands
and the high dynamics in the related traffic flow propagation processes. Based on
experiment results from the third day data, the zeroth-order polynomial model seems
to be more attractive than the first-order model, since it offers acceptable accuracy
with considerably enhanced computational efficiency. On the other hand, if real-time
response constraints can be satisfied, keeping a high-order polynomial model is
always preferable, because it can capture nonlinear patterns in the demand structure
changes.

Fig. 16.2. Irvine network (legend: OD pair from zone 53 to zone 40).

Figure 16.5 plots simulated density, predicted density and the observed density 
on link 212, using a 20-minute prediction horizon. Specifically, the simulated density 
and the predicted density are generated from the DTA network state estimation mod- 
ule and the DTA state prediction module, respectively. Link 212 is a freeway link 
going northbound, and its location is marked in Figure 16.2. The DTA network state 
estimator is able to capture the time-varying trends of real-world traffic, while the 
DTA network state predictor can forecast dynamic flow propagation with acceptable 
quality. The results above further validate the effectiveness of the proposed real- 
time OD estimation and prediction framework, and illustrate the role of the PeMS 
database in enabling and improving this process.

Fig. 16.4. RMS errors in density for polynomial models.

Fig. 16.5. Observed density vs. simulated density and predicted density on link 212.


16.6 Concluding Comments 


Real-time OD estimation and prediction is an important component in real-time 
dynamic traffic assignment for ATMS/ATIS network applications. This chapter ex- 
ploits the potential of using a structural state space model to systematically incor- 
porate regular demand pattern information, structural deviations and random fluc- 
tuations. The contributions of this study include the following. First, a polynomial 
trend filter is developed to estimate and predict demand deviations from the a priori 
estimate of the regular demand pattern, so as to utilize valuable historical informa- 
tion and adaptively respond to possible structural deviations in demands. Second, 
based on a Kalman filtering framework, an optimal recursive procedure is proposed 
for updating the regular demand pattern estimate with new real-time estimates and 
observations obtained every day. These models can be naturally integrated into the 
real-time DTA system and provide an effective and efficient approach to utilize the 
real-time traffic data continuously in the operational settings. Third, the application 
to the Irvine network provides an illustration of the usefulness of the PeMS data to 
calibrate and test advanced network modeling tools intended for ITS planning and 
operation. 

One particularly attractive opportunity would be to integrate our dynamic net- 
work modeling capability explicitly with PeMS into the decision support system. The 
day-to-day updating framework would then be used to regularly update the modeling 
tools, keeping an up-to-the-minute version ready for application to analyze scenarios 
and contemplated actions for unfolding conditions. An important capability in this 
process would be the identification, through the extensive PeMS data warehouse, of
most comparable days or patterns in order to provide a starting point, to be updated
using the methods presented in this work. 

Through the efforts of Varaiya, his students and collaborators, the theory and 
practice of traffic systems engineering is making strides towards greater levels of 
intelligence in terms of information on prevailing conditions as well as dynamic 
management strategies.

17.1 Introduction


Freeway traffic congestion is a major problem in today’s metropolitan areas. It 
occurs regularly during commute hours. In addition, non-recurrent congestion often 
takes place as a result of incidents, road work, or public events. Congestion leads to 
a variety of detrimental effects such as inefficient operation of freeways, wasted re- 
sources, increased pollution, and intensified driver fatigue. The 2004 Urban Mobility 
Report [1] finds: “Congestion has grown everywhere in areas of all sizes. Congestion 
occurs during longer portions of the day and delays more travelers and goods than 
ever before.” It was estimated in this report that, in 2002, congestion cost Americans 
3.5 billion hours of delay and 5.7 billion gallons of wasted fuel, with an equivalent 
monetary cost of U.S. $63.2 billion. On-ramp metering has been widely used as an 
effective strategy to increase freeway operation efficiency. It has been recommended 
to the U.S. Federal Highway Administration as the No. 1 tool to address the
congestion problem, other than adding more capacity to transportation infrastructures [2].
It has been reported that ramp metering was able to reduce delay by 101 million 
person-hours in 2002, approximately 5% of the congestion delay on freeways where 
ramp-metering was in effect [1]. 

Accurate freeway traffic models are extremely valuable tools for the design and 
evaluation of on-ramp metering strategies. However, the development and calibra- 
tion of these models, particularly those that are microscopic, is often laborious and 
time-consuming. To help fulfill the goal of providing an accurate, computationally 
efficient and easy to calibrate model for the development and analysis of on-ramp 
metering algorithms, a piecewise-linearized version of Daganzo’s macroscopic Cell 
Transmission Model (CTM) [3, 4], called the Switching-Mode Model (SMM), has 
been developed [5], and will be discussed in Section 17.2.2. Its linear structure lends 
the advantage of simplifying control analysis, design, and data-estimation methods. 
The CTM, which is briefly described in Section 17.2.1, has many favorable features, 
particularly its simplicity, ease of calibration, and its ability to reproduce important 
traffic phenomena such as shock wave propagation. These properties are also
inherited by the SMM. Both the CTM and SMM have been shown to perform well
in describing traffic behavior when tested with data from a 2-mile portion of a
14-mile test section of Interstate 210 Westbound (I-210W) in southern California [5].
Furthermore, the observability and controllability properties of the SMM, which are 
of fundamental importance in the design of traffic data estimators and freeway on- 
ramp control systems, can be derived using standard linear systems techniques, as 
discussed in Section 17.2.3. Moreover, in order to facilitate the calibration process 
of these models, a semi-automated calibration methodology for estimating the CTM 
and SMM parameters has been developed [6, 7], as described in Section 17.3. The 
semi-automated calibration procedure was also tested with data from the same 14- 
mile segment of I-210W, which typically endures heavy congestion during the week- 
day morning commute period, and the calibrated CTM was shown to reproduce ob- 
served bottleneck locations and the qualitative evolution of traffic congestion, yield- 
ing an average of about 2% error in the predicted partial total travel time [6, 7]. 

To effectively control freeway traffic, it is desirable that traffic data, such as flow 
and density, be available continuously across time and space. However, because of 
the high cost and difficulty of installing and maintaining loop detectors, oftentimes 
traffic data are not available at all desired locations at all times and, as a conse- 
quence, missing data must be estimated using available data from other locations. 
On the other hand, the traffic congestion mode, i.e., whether the traffic is flowing 
freely or is congested, cannot be measured directly and has to be inferred from other 
quantities. For these purposes, a traffic state estimator has been developed based on 
the SMM [8, 9], using a modified mixture Kalman filter (MKF) algorithm [10], as 
described in Section 17.4. This estimator is able to estimate the vehicle densities at 
unmeasured locations, as well as determine the traffic congestion mode in a freeway 
section. It was tested on a 2-mile section of freeway and its performance was evalu- 
ated using the measured data. It was shown that on average, a mean percentage error 
of ~10% was achieved for vehicle density estimation at unmeasured locations. The 
MKF-based traffic state estimator was implemented on our entire selected 14-mi I- 
210W test site, as well as interfaced with both a calibrated CTM simulator [6] and a 
calibrated VISSIM microscopic simulator [11]. 

Section 17.5 discusses on-ramp metering control strategies. Traffic-responsive 
and decentralized on-ramp metering control schemes, such as the ALINEA algo- 
rithm of Papageorgiou [12], in which each on-ramp controller only utilizes traffic 
measurements from the freeway mainline near the on-ramp merge point, have been 
shown to be effective and in many instances perform more robustly than coordinated 
metering schemes. However, because traffic dynamics behave differently under free- 
flow or congested conditions, leading to a freeway segment having different control- 
lability and observability properties depending on its congestion state [5], it may be 
advantageous to change the structure of the localized controller to suit the current 
controllability and observability properties of the segment. Based on this observa- 
tion, a traffic responsive and decentralized switching ramp-metering controller has 
been developed [13, 14], which employs a different feedback structure depending 
on whether the mainline segment at the on-ramp merge junction is in a free-flow or 
congested mode, and is briefly described in Section 17.5.1.

A requirement that is often imposed on ramp-metering schemes is that queues
must not exceed the storage capacity of the on-ramp, in order to prevent the queue
from spilling over and inducing congestion in surface streets. It has been observed
[15, 16] that the “queue-override” scheme that is most frequently used in the U.S.
leads to oscillatory behavior and under-utilization of on-ramp storage capacities.
To address this problem, [16] proposed the use of a proportional on-ramp
queue controller that requires knowledge of the on-ramp vehicle demand, which may 
not be available in real-time in the field. To avoid the use of on-ramp vehicle de- 
mand measurements, a proportional and integral queue length regulator has been 
proposed [13, 14], which estimates the length of the queue by measuring vehicle 
speeds as they approach the queue, and is discussed in Sections 17.5.2 and 17.5.3. 

Section 17.5.4 discusses an overall localized and traffic-responsive on-ramp con- 
trol strategy that reduces the spatial and temporal span of the congestion while main- 
taining on-ramp queues within on-ramp storage capacities. This control strategy 
switches between the mainline density ramp-metering controller in Section 17.5.1 
and the queue length regulator in Section 17.5.2, by choosing the controller that
produces the less restrictive metering strategy at every sampling instant. Test results
of the use of this control strategy on a calibrated microscopic traffic simulator are 
presented and compared with results obtained when ALINEA [12] is used instead of 
the switching ramp-metering controller of Section 17.5.1. 

Conclusions and final remarks are presented in Section 17.6. 


17.2 Macroscopic Modeling and Analysis of Freeway Traffic 
Dynamics 


Two fundamentally different approaches have typically been applied in order to 
model traffic dynamics. The microscopic approach seeks to reproduce the behavior 
of the individual driver/vehicle unit, as it responds to its environment by adjusting 
its speed. The macroscopic approach, on the other hand, ignores the dynamics of the 
individual driver and instead attempts to replicate the aggregate response of a large 
number of vehicles. In this section we first briefly review the cell transmission model 
(CTM) introduced by Daganzo [3, 4], which is a finite difference approximation of 
the well-known Lighthill, Whitham and Richards (LWR) model [17, 18] and is based 
on the intuitive concepts of sending and receiving flows. Subsequently, we describe a 
piecewise-linearized version of the CTM, called the switching-mode model (SMM), 
where the traffic dynamics in each segment of the freeway is modeled by a hybrid 
system that switches among several sets of linear difference equations, depending on 
the congestion status of the cells in the segment and the boundary conditions of the 
segment. The SMM forms the basis for most of the real-time estimation and on-ramp 
metering control algorithms that are discussed in this chapter.


17.2.1 Cell Transmission Model (CTM)


In Daganzo’s CTM, a freeway is partitioned into a series of cells. An example is 
shown in Fig. 17.1(a), where it is assumed that nonuniform cell lengths are allowed.

Fig. 17.1. (a) Freeway segment partitioned into cells: 1-4 on mainline, cell OR on on-ramp,
and cell FR on off-ramp; (b) Trapezoidal fundamental diagram.


The traffic density in cell $i$ evolves according to conservation of vehicles:

$$\rho_i(k+1) = \rho_i(k) + \frac{T_s}{l_i}\left(q_{i,in}(k) - q_{i,out}(k)\right) \qquad (17.1)$$

where $q_{i,in}(k)$ and $q_{i,out}(k)$ are, respectively, the total flows, in vehicles per unit
time, entering and leaving cell $i$ during the $k$th time interval, $T_s[k, k+1)$, including
flows along the mainline and the on- and off-ramps. $k$ is the time index, $T_s$ is the
discrete time interval, $l_i$ is the length of cell $i$, and $\rho_i(k)$ is the density, in vehicles per
unit length of freeway, in cell $i$ at time $kT_s$. The model parameters include $v$, $w$, $Q_M$,
and $\rho_J$, which are depicted in the trapezoidal fundamental diagram of Fig. 17.1(b).
The parameters can be uniform over all cells or allowed to vary from cell to cell. For 
reference, the parameters are defined as: 


$v$ ... free-flow speed (mph)
$w$ ... backward congestion wave speed (mph)
$Q_M$ ... maximum allowable flow (veh/hr, i.e., vph)
$\rho_J$ ... jam density (veh/mi, i.e., vpm)
$\rho_c$ ... critical density (vpm)


The left part of the fundamental diagram of Fig. 17.1(b), where $Q(\rho) = v\rho$, is
an approximation of the typical behavior of free-flow traffic, whereas the right side
($Q(\rho) = w(\rho_J - \rho)$) is associated with congested traffic.

Three different types of intercell connection are allowed: simple connection, 
merge, and diverge. In a simple connection, two cells are connected to one an- 
other without any intervening on-ramps or off-ramps (for example, cells 2 and 3 in 
Fig. 17.1(a)). Let $i-1$ be the upstream cell and $i$ be the downstream cell in the pair.
As described in [4], $q_i(k)$, the flow entering cell $i$ from the mainline, is determined
by taking the minimum of two quantities:

$$q_i(k) = \min\left(S_{i-1}(k),\ R_i(k)\right), \qquad (17.2)$$
$$S_{i-1}(k) = \min\left(v_{i-1}\,\rho_{i-1}(k),\ Q_{M,i-1}\right), \qquad (17.3)$$
$$R_i(k) = \min\left(Q_{M,i},\ w_i\,(\rho_{J,i} - \rho_i(k))\right), \qquad (17.4)$$

where $S_{i-1}(k)$ is the maximum flow that can be supplied by cell $i-1$ under free-flow
conditions, over the $k$th time interval, and $R_i(k)$ is the maximum flow that can
be received by cell $i$ under congested conditions, over the same time interval.
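
As an illustration of Eqs. (17.1)-(17.4), the following is a minimal Python sketch of one CTM time step for a chain of cells joined by simple connections only (no merges or diverges). The function names, the boundary treatment (a given upstream demand `q_up` and a measured downstream density `rho_down`, with the downstream cell assumed to share the last cell's parameters), and all numerical values are assumptions made for this example and are not taken from [4] or [7].

```python
import numpy as np

def sending(rho, v, Q_M):
    # Eq. (17.3): maximum flow each cell can supply
    return np.minimum(v * rho, Q_M)

def receiving(rho, w, Q_M, rho_J):
    # Eq. (17.4): maximum flow each cell can accept
    return np.minimum(Q_M, w * (rho_J - rho))

def ctm_step(rho, q_up, rho_down, v, w, Q_M, rho_J, lengths, Ts):
    """Advance the cell densities by one time step.

    Units: densities in veh/mi, flows in veh/hr, speeds in mph,
    lengths in mi, Ts in hr. Returns (new densities, boundary flows),
    where q[i] is the flow across the upstream boundary of cell i.
    """
    S = sending(rho, v, Q_M)
    R = receiving(rho, w, Q_M, rho_J)
    q = np.empty(len(rho) + 1)
    q[0] = min(q_up, R[0])                  # upstream demand, limited by cell 0
    q[1:-1] = np.minimum(S[:-1], R[1:])     # Eq. (17.2) at interior boundaries
    # Assumption: the cell downstream of the segment has the last cell's parameters.
    R_down = min(Q_M[-1], w[-1] * (rho_J[-1] - rho_down))
    q[-1] = min(S[-1], R_down)
    rho_next = rho + Ts / lengths * (q[:-1] - q[1:])   # Eq. (17.1)
    return rho_next, q

# Hypothetical 4-cell example (values chosen only for illustration).
N = 4
v = np.full(N, 60.0)        # mph
w = np.full(N, 15.0)        # mph
Q_M = np.full(N, 8000.0)    # veh/hr
rho_J = np.full(N, 700.0)   # veh/mi
lengths = np.full(N, 0.25)  # mi, satisfies v*Ts < l for Ts = 10 s
Ts = 10.0 / 3600.0          # hr

rho = np.array([50.0, 50.0, 600.0, 600.0])   # free flow upstream, congestion downstream
rho_next, q = ctm_step(rho, 3000.0, 600.0, v, w, Q_M, rho_J, lengths, Ts)
print(rho_next, q)
```

In this example the flow into the first congested cell is limited by its receiving capacity, so density accumulates in the last free-flow cell, which is the backward propagation of congestion that the CTM is meant to reproduce.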

In the form presented here, the CTM also uses density-based versions of the 
merge and diverge laws of [4] to incorporate on-ramp and off-ramp flows; a complete 
statement of these merge and diverge laws can be found in [7]. A merge and diverge 
are shown within the context of a freeway segment in Fig. 17.1(a), where $q_2$ and $r$
are the flows merging into cell 2 from cells 1 and OR, and $q_4$ and $f$ are the flows
diverging into cells 4 and FR from cell 3. The diverging flows are defined as
$q_4(k) = (1 - \beta(k))\,q_{3,out}(k)$ and $f(k) = \beta(k)\,q_{3,out}(k)$, where $\beta(k)$ is the split
ratio for the diverge junction, i.e., the fraction of vehicles leaving cell 3 which exits to cell FR
during the $k$th time interval. It is assumed here that the split ratios can be determined
externally to the model as functions of time. 

The CTM is subject to the same intercell connectivity restrictions as those de- 
scribed in [4], such as limiting the maximum number of separate flow streams enter- 
ing any cell to 2. Another requirement of the CTM is that the cell lengths must be 
longer than the free-flow travel distance, i.e., 


$$v_i T_s < l_i, \qquad (17.5)$$


for cell $i$. The necessity of this condition for convergence of CTM solutions to
LWR model solutions is explained in [19]. The CTM consists of flow conservation,
Eq. (17.1), for each cell, along with the flow relations, Eqs. (17.2)-(17.4) and the
merge and diverge laws. The state vector is $\rho = [\rho_1 \ldots \rho_N]^T$ for a freeway
partitioned into $N$ cells.


17.2.2 Switching-Mode Model 


In order to gain additional insight into freeway traffic behavior, and to sim- 
plify the control analysis, control design, and data-estimation design methods, a 
piecewise-linearized version of the CTM, called the switching-mode model (SMM), 
has been designed [5]. Since the SMM is composed of several linear models, straight- 
forward linear techniques for model analysis and control design can be applied to the 
individual linear subsystems. 

The SMM is a hybrid system that switches among five sets of linear difference 
equations, depending on the congestion status of the cells and the values of the 
mainline boundary data. Assuming the state vector is composed of the cell densities,
$\rho = [\rho_1 \ldots \rho_N]^T$, the key difference between the CTM and the SMM is that,
with respect to density, the former is nonlinear, whereas each mode of the latter is
linear. The SMM can be extracted from the CTM by writing each inter-cellular flow
as either an explicit function of cell density, or as a constant. For example, in the case
of a segment without merges or diverges, each $q_i$ would be replaced with $v\rho_{i-1}(k)$,
$w(\rho_J - \rho_i(k))$, or $Q_M$. This explicit density dependence is achieved by supplying
a set of logical rules that determine the congestion status of each cell, at every time 
step, based on measurements at the segment boundaries. 

For simplicity, the following assumptions are made: 


1. The densities and flows at the upstream and downstream segment boundaries, as 
well as flows on all the on-ramps, are measured. 

2. There is at most one status transition (or wave front) in the freeway section. If 
both the upstream and downstream mainline boundaries are of the same status, 
i.e., both free-flow or both congested, it is assumed that all the mainline cells, 
1 through $N$, have the same status, while if the two boundaries are of different
status, there exists a single wave front in the segment, upstream of which all the 
cells have congested (free-flow) status, and downstream of which all cells have 
free-flow (congested) status. 


The single-wave front assumption is an approximation that is expected to be 
acceptable for short freeway segments with only one on-ramp and off-ramp, such as 
the example later in this section. To more accurately deal with longer sections with 
many on- and off-ramps, the switching logic can be modified to allow multiple wave 
fronts within a segment. 

Since an SMM-modeled section contains at most one congestion wave front, 
the modes of the SMM can be distinguished by the congestion status of the cells 
upstream and downstream of the wave front. If there is no wave front in the section, 
a repeated label, e.g., “Free-flow–Free-flow”, can be used to indicate the absence of
any status transition. The five modes are denoted: (1) “Free-flow–Free-flow” (FF), (2)
“Congestion–Congestion” (CC), (3) “Congestion–Free-flow” (CF), (4) “Free-flow–Congestion 1”
(FC1), and (5) “Free-flow–Congestion 2” (FC2). The two modes of
“Free-flow–Congestion” are determined by the relative magnitudes of the supplied
flow of the last uncongested cell upstream of the wave front and the received flow 
of the first congested cell downstream of the wave front. If the former is smaller, 
the SMM is in FC1, while if the latter is smaller, it is in FC2. Respectively, these 
two cases are distinguished by whether the congestion wave is traveling forward or 
backward within the segment. 

Consider the freeway segment in Fig. 17.1(a), which contains 4 mainline cells.
The on- and off-ramps will not be modeled as cells in this case. The measured aggregate
flows and densities at the upstream and downstream mainline detectors are
denoted by $q_u$, $\rho_u$, and $q_d$, $\rho_d$. All five modes of the SMM can be summarized as
follows:

$$\rho(k+1) = A_s\,\rho(k) + B_s\,u(k) + B_{J,s}\,\rho_J + B_{Q,s}\,q_M, \qquad (17.6)$$

where $s = 1, 2, 3, 4, 5$ indicates the mode (1: FF, 2: CC, 3: CF, 4: FC1, 5: FC2),
$\rho = [\rho_1 \ldots \rho_4]^T$ is the state, and $u = [q_u\ \ r_2\ \ \rho_d]^T$ are the flow and density
inputs; specifically, $r_2$ is the measured on-ramp flow entering the section, subscripted
according to its cell of entry. $\rho_J = [\rho_{J,1}\ \rho_{J,2}\ \rho_{J,3}\ \rho_{J,4}\ \rho_{J,5}]^T$ is the vector of jam
densities, and $q_M = [Q_{M,1}\ Q_{M,2}\ Q_{M,3}\ Q_{M,4}]^T$ is the vector of maximum flow rates.
Eq. (17.6) can alternatively be written with the downstream flow, $q_d$, as an input instead
of $\rho_d$, but the stated form has the advantage of eliminating the $B_{J,s}\,\rho_J$ term
for modes with downstream congestion, in the case where $w_i$ and $\rho_{J,i}$ are the same
for each cell.

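In FF mode, the flow across each cell boundary is dictated by upstream conditions;
specifically, each cell releases traffic at the free-flow rate according to the first
term in Eq. (17.3). That is, the total flow exiting cell $i$ is given by $v_i\rho_i$. The flow
across the upstream boundary of cell 1 is $q_u$. In the case of time-varying split ratios,
the FF-mode state matrices (for the segment of Fig. 17.1(a)), obtained by substituting
these flows into Eq. (17.1), are

$$A_1(k) = \begin{bmatrix}
1 - \frac{v_1 T_s}{l_1} & 0 & 0 & 0 \\
\frac{v_1 T_s}{l_2} & 1 - \frac{v_2 T_s}{l_2} & 0 & 0 \\
0 & \frac{v_2 T_s}{l_3} & 1 - \frac{v_3 T_s}{l_3} & 0 \\
0 & 0 & (1 - \beta(k))\frac{v_3 T_s}{l_4} & 1 - \frac{v_4 T_s}{l_4}
\end{bmatrix}, \qquad
B_1 = \begin{bmatrix}
\frac{T_s}{l_1} & 0 & 0 \\
0 & \frac{T_s}{l_2} & 0 \\
0 & 0 & 0 \\
0 & 0 & 0
\end{bmatrix},$$

with $B_{J,1} = 0$ and $B_{Q,1} = 0$.

For the CC mode, the flow across each cell boundary is dictated by downstream
conditions; specifically, each cell absorbs flow according to the second term in
Eq. (17.4). That is, the total flow entering cell $i$ is given by $w_i(\rho_{J,i} - \rho_i)$. The
flow released by cell 4 is determined by the downstream density $\rho_d$. The split-ratio-dependent
matrices, obtained in the same way and with $w_5$ and $\rho_{J,5}$ denoting the parameters of the
cell immediately downstream of the segment, are

$$A_2(k) = \begin{bmatrix}
1 - \frac{w_1 T_s}{l_1} & \frac{w_2 T_s}{l_1} & 0 & 0 \\
0 & 1 - \frac{w_2 T_s}{l_2} & \frac{w_3 T_s}{l_2} & 0 \\
0 & 0 & 1 - \frac{w_3 T_s}{l_3} & \frac{w_4 T_s}{(1 - \beta(k))\,l_3} \\
0 & 0 & 0 & 1 - \frac{w_4 T_s}{l_4}
\end{bmatrix}, \qquad
B_{J,2}(k) = \begin{bmatrix}
\frac{w_1 T_s}{l_1} & -\frac{w_2 T_s}{l_1} & 0 & 0 & 0 \\
0 & \frac{w_2 T_s}{l_2} & -\frac{w_3 T_s}{l_2} & 0 & 0 \\
0 & 0 & \frac{w_3 T_s}{l_3} & -\frac{w_4 T_s}{(1 - \beta(k))\,l_3} & 0 \\
0 & 0 & 0 & \frac{w_4 T_s}{l_4} & -\frac{w_5 T_s}{l_4}
\end{bmatrix},$$

with

$$B_2 = \begin{bmatrix}
0 & \frac{T_s}{l_1} & 0 \\
0 & 0 & 0 \\
0 & 0 & 0 \\
0 & 0 & \frac{w_5 T_s}{l_4}
\end{bmatrix}$$

and $B_{Q,2} = 0$.
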

Since the FF and CC modes are of primary importance in the estimator and controller
designs of Sections 17.4 and 17.5, a discussion of the mixed modes (CF, FC1,
and FC2) is omitted here. For a complete description of the mixed modes, please
see [5, 7]. A set of switching rules was developed [5] to determine the mode of the
system at each time step, based on the measured mainline boundary data and the
congestion status of the cells in the section. If both $\rho_u$ and $\rho_d$ have free-flow status
(for a triangular fundamental diagram, this means they are below the critical density
$\rho_c$ for that region), the FF mode is selected, and if both of these densities are congested,
the CC mode is selected. If $\rho_u$ and $\rho_d$ are of opposite status, then the SMM
performs a search over the $\rho_i$ to determine whether there is a status transition inside
the section. This wave front search consists of searching through the cells, in order,
looking for the first status transition between adjacent cells. The CTM and SMM
have been validated for an approximately 2-mi section of I-210W, where they were
shown to yield estimation errors of about 13% for density and 4% for flow [5, 7].
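
The switching rules described above can be sketched in Python as follows. This is only an illustrative reading of the text, not the rule set of [5]: it assumes uniform cell parameters, classifies a density as congested when it is at or above the critical density, resolves ties between sending and receiving flow in favor of FC2, and, when the boundaries disagree but no interior transition is found, falls back to the common status of the cells. The function name `smm_mode` and all numerical values are hypothetical.

```python
def smm_mode(rho_u, rho_d, rho, rho_c, v, w, rho_J, Q_M):
    """Return (mode, front) for one segment under the single-wave-front assumption.

    rho_u, rho_d : measured upstream and downstream boundary densities
    rho          : list of cell densities, ordered upstream to downstream
    front        : index of the first cell downstream of the wave front,
                   or None when no interior front is found
    """
    up_cong = rho_u >= rho_c
    down_cong = rho_d >= rho_c
    if not up_cong and not down_cong:
        return "FF", None
    if up_cong and down_cong:
        return "CC", None

    # Boundaries disagree: scan the cells in order for the first status change.
    status = [r >= rho_c for r in rho]
    front = next((i for i in range(1, len(rho)) if status[i] != status[i - 1]), None)
    if front is None:
        # Assumption of this sketch: use the cells' common status.
        return ("CC" if status[0] else "FF"), None
    if up_cong:
        return "CF", front

    # Free flow upstream of the front, congestion downstream of it:
    # compare the supplied flow, Eq. (17.3), with the received flow, Eq. (17.4).
    S = min(v * rho[front - 1], Q_M)
    R = min(Q_M, w * (rho_J - rho[front]))
    return ("FC1" if S < R else "FC2"), front


# Hypothetical parameters: v = 60 mph, w = 15 mph, Q_M = 8000 veh/hr, rho_J = 700 veh/mi.
params = dict(rho_c=8000.0 / 60.0, v=60.0, w=15.0, rho_J=700.0, Q_M=8000.0)
print(smm_mode(50, 650, [50, 100, 600, 650], **params))
```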


17.2.3 Observability and controllability of the SMM 


Observability and controllability properties have been derived for the SMM 
modes, using standard techniques for time-varying linear systems. The middle col- 
umn of Table 17.1 summarizes the observability for each SMM mode, assuming 
that mainline detectors are only available at the upstream and downstream segment 
boundaries, as in Fig. 17.1(a). On the left side, “upst. cells” and “downst. cells” give 
the status of cells both upstream and downstream of the congestion wave front. If 
there is no such wave front, both sets of cells have the same status. The middle col- 
umn indicates which of the two mainline boundary measurements, if either, can be 
used to make the mode observable. These results can be obtained by computing the 
observability Gramians for the $A_s(k)$ with the output matrices $C_u = [1\ 0\ 0\ 0]$ and
$C_d = [0\ 0\ 0\ 1]$.

From the table, it can be seen, as a general result, that if all cells have free- 
flow status, the densities are observable using a downstream measurement, while 
in congested mode, they are observable using an upstream measurement. This is 
related to the wave (information) propagation directions on a freeway in different 
congestion modes. When a freeway section is in free-flow mode, the information 
propagates downstream at speed v, which is the vehicle traveling speed. Therefore, 
in order to be able to estimate the cell densities, the downstream density measurement 
is needed. When the freeway is in congestion, the information propagates upstream 
at speed w, which is the backward congestion wave traveling speed, and an upstream 
measurement is needed to estimate densities. 
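
The FF and CC results can be checked numerically with a simplified, time-invariant version of the test. The sketch below assumes a constant split ratio and uniform cell parameters, and uses the rank of the Kalman observability matrix in place of the Gramian computation for time-varying systems mentioned above. The state matrices follow from substituting the FF-mode and CC-mode flows into Eq. (17.1); all numerical values are hypothetical.

```python
import numpy as np

def observability_rank(A, C):
    """Rank of the observability matrix [C; CA; ...; CA^(n-1)]."""
    n = A.shape[0]
    blocks = [C @ np.linalg.matrix_power(A, k) for k in range(n)]
    return int(np.linalg.matrix_rank(np.vstack(blocks)))

# Hypothetical uniform parameters: Ts = 10 s, l = 0.25 mi, v = 60 mph, w = 15 mph.
Ts, l, v, w, beta = 10.0 / 3600.0, 0.25, 60.0, 15.0, 0.1
a = v * Ts / l   # free-flow coefficient
b = w * Ts / l   # congestion coefficient

# FF mode: each cell is driven by the cell upstream of it (lower bidiagonal).
A_ff = np.array([[1 - a, 0, 0, 0],
                 [a, 1 - a, 0, 0],
                 [0, a, 1 - a, 0],
                 [0, 0, (1 - beta) * a, 1 - a]])

# CC mode: each cell is driven by the cell downstream of it (upper bidiagonal).
A_cc = np.array([[1 - b, b, 0, 0],
                 [0, 1 - b, b, 0],
                 [0, 0, 1 - b, b / (1 - beta)],
                 [0, 0, 0, 1 - b]])

C_u = np.array([[1.0, 0.0, 0.0, 0.0]])   # upstream density measurement
C_d = np.array([[0.0, 0.0, 0.0, 1.0]])   # downstream density measurement

ranks = {
    ("FF", "down"): observability_rank(A_ff, C_d),
    ("FF", "up"): observability_rank(A_ff, C_u),
    ("CC", "up"): observability_rank(A_cc, C_u),
    ("CC", "down"): observability_rank(A_cc, C_d),
}
print(ranks)
```

A full-rank result (4) indicates that all four cell densities can be reconstructed from that single measurement, which is consistent with the direction of information propagation described above.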


Table 17.1. Observability and controllability for different SMM modes (“OR” indicates on-ramp).

Upst. Cells | Downst. Cells | Observable with       | Controllable using
------------|---------------|-----------------------|--------------------
Free-flow   | Free-flow     | Downst. Measurement   | Upst. On-Ramp
Congested   | Congested     | Upst. Measurement     | Downst. On-Ramp
Congested   | Free-flow     | Upst. & Downst. Meas. | Not Controllable
Free-flow   | Congested 1   | Unobservable          | Upst. & Downst. OR
Free-flow   | Congested 2   | Unobservable          | Upst. & Downst. OR

Controllability can be analyzed analogously, and the results are summarized in 
the last column of Table 17.1. Generally, a section in free-flow mode is controllable 
from an on-ramp at its upstream end, whereas a congested section can be controlled 
from an on-ramp at its downstream end. The observability and controllability of the 
mixed modes (CF, FC1, and FC2), along with derivations of the results for all modes, 
are discussed in more detail in [7].


17.3 Calibration of the Cell Transmission and Switching Mode
Model Parameters to the I-210 Testbed


In this section, a methodology for tuning the CTM and SMM parameters to re- 
produce observed freeway traffic behavior is described. The calibration method has 
been tested on a 14-mile stretch of Interstate 210 Westbound (I-210W) in Pasadena, 
California, which is shown in the map in Fig. 17.2, and typically endures heavy
congestion during the weekday morning commute period.

Fig. 17.2. A map of the I-210W testbed. Composed using the U.S. Census Bureau 2004
TIGER/Line® data.


17.3.1 Freeway representation


The 14-mile I-210W test segment has been divided into 41 cells, as shown in 
Fig. 17.3. This partition was adapted from a 40-cell partition that was designed by 
Gabriel Gomes for use in the optimization work of [20, 21]. The traffic flow di- 
rection is in order of increasing cell index, i.e., left to right, starting at the top of 
the figure. The cell index is located in the center of each cell. The uppermost row 
of numbers above the cells is the cell length (in feet). The second row of numbers 
gives the number of mixed-flow lanes (4 to 6) in each cell. Vertical gray bars mark 
the locations of the mainline loop detectors, and the postmile of the detector (e.g., 
39.159) is listed above the detector marker. On- and off-ramps are depicted as num- 
bered arrows. Associated street names are given for each set of ramps. A single 
high-occupancy vehicle (HOV) lane runs parallel to the leftmost mainline lane on 
this segment of I-210W. Each of the six HOV-lane gates is indicated by a horizontal 
gray bar, and an additional fictitious on-ramp (no. 22) was used to approximate the 
flow of vehicles entering the mixed-flow lanes from the Lake Ave. HOV gate. 

In the default partitioning method, cell boundaries are placed on the mainline im- 
mediately upstream of on-ramps and immediately downstream of off-ramps. How- 
ever, for the chosen model time step of 10 sec., and a typical free flow speed of 
63 mph, three of the cells were found to be shorter than the minimum allowed cell 
length of 924 ft; to satisfy this constraint, several adjustments (described in [6, 7]) 
were made to the asterisk-marked cells.

Fig. 17.3. 41-cell partition of I-210W testbed. Adapted from a 40-cell partition developed by
Gabriel Gomes for use in optimization studies related to [20, 21].
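
The 924 ft minimum cell length quoted above follows directly from condition (17.5) with a 10 s time step and a 63 mph free-flow speed. A short Python check is sketched below; the function names and the list of cell lengths are hypothetical and are not the actual I-210W partition.

```python
FT_PER_MILE = 5280.0
SEC_PER_HOUR = 3600.0

def min_cell_length_ft(v_mph, Ts_sec):
    """Free-flow travel distance v*Ts, in feet, i.e., the bound in Eq. (17.5)."""
    return v_mph * FT_PER_MILE / SEC_PER_HOUR * Ts_sec

def short_cells(lengths_ft, v_mph, Ts_sec):
    """Indices of cells that are shorter than the free-flow travel distance."""
    l_min = min_cell_length_ft(v_mph, Ts_sec)
    return [i for i, length in enumerate(lengths_ft) if length < l_min]

print(min_cell_length_ft(63.0, 10.0))                     # approximately 924 ft
print(short_cells([1000.0, 800.0, 1500.0, 500.0], 63.0, 10.0))
```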


Traffic data used in the model calibration was mostly obtained from the PeMS
website [22], developed by Varaiya and his associates. In a few instances, where
PeMS data was incomplete, demands were reconstructed using a set of manually
counted I-210W ramp flows provided by Caltrans. The reader is referred to [6, 7]
for details on data processing, demand reconstruction, split ratio estimation, HOV 
modeling, and applicable simplified cases of the merge and diverge laws. 


17.3.2 Calibration methodology 


The main steps of the calibration procedure are as follows: 

1. Free-flow Parameter Calibration: The free-flow traffic velocities, $v_i$, are
determined by performing a least-squares fit on the flow versus density data over
the period 5:00-6:00AM. For the I-210 section, traffic typically flows freely during
this period. For the $j$th detector, $v_j$ is the solution, in the least-squares sense, to the
equation $\Phi_j v_j = Y_j$, where $\Phi_j$ and $Y_j$ are column vectors that respectively contain
densities and flows measured over the specified time interval. The free-flow speed $v_j$
is assigned to the cell containing detector $j$, and free-flow speeds are computed for
non-detector cells by linear interpolation.

2. Bottleneck Identification: Bottleneck locations are identified by examining
contour plots of the measured traffic densities and/or speeds, and determining the
locations of fixed spatial boundaries which divide the freeway into an upstream
congested region and a downstream free-flow region. For example, in the top plot of
Fig. 17.4, a bottleneck was observed to form between the detectors at 33.049 and
32.199 during the 6:00 time slice.

3. Non-Bottleneck Capacity Selection: A set of nominal $Q_{M,i}$ is assigned to
the cells that are not located at bottlenecks. It is not advisable to set $Q_{M,i}$ equal to the
maximum observed flow at each detector-equipped cell, since this will most likely
result in underestimating the true capacity of the freeway. Typically, the nominal
$Q_{M,i}$ must be chosen to be larger than the maximum observed flows (usually >
2000 veh/hr per lane (vphpl)) in each region of the freeway.

4. Bottleneck Capacity Determination: Consider a freeway portion divided into
two consecutive cells, 1 and 2, where an active bottleneck exists between the two
cells, hence the upstream cell is congested, while the downstream cell remains in
free-flow status. Further assume that an on-ramp (with merging flow $r_2$ entering cell
2) exists between the two cells. It can be shown that the bottleneck capacity in this
situation is represented by $Q_{M,2} = q_2 + r_2$, where $q_2$ is the flow entering cell 2 from
the mainline [6, 7]. Since both $q_2$ and $r_2$ are measurable, these quantities are used
to estimate the bottleneck flow rate, with the default method (assuming no faulty or
missing data) being $Q_{M,2} = \mathrm{mean}_{k \in K_M}\,(q_2(k) + r_2(k))$. $K_M$ corresponds to the
half-hour time interval ending at $\arg\max_k\,(q_2(k) + r_2(k))$.

5. Congestion Parameter Calibration: First, the critical density is estimated for
each detector: $\rho_{c,j} = \max_k\,(q_{d_j}(k))/v_j$, where $q_{d_j}$ is the flow measured at
detector $j$. Then, the flow and density measurements are sorted so that only congested
pairs are used in the parameter estimation. The congested-mode equations
can be rewritten so they are linear in $[w_j\ \ w_j\rho_{J,j}]^T$. By substituting in the congested
$(\rho_{d_j}(k), q_{d_j}(k))$ measurements, and applying an upper-bound constraint on $Q_{M,j}$ to
preserve the maximum flow rates determined in previous steps, the estimated parameters
are derived from the $[w_j\ \ w_j\rho_{J,j}]^T$ that solves the constrained least-squares
problem. Additional details on constraints and interpolation methods are provided
in [6, 7].

6. Time-Varying Parameter Adjustments: If necessary, temporary parameter
changes (e.g., reduction of $Q_{M,i}$ in a region) can be applied to reproduce the effect
of an incident. Also, by reducing w; in the mid-morning time range, when the traffic 
is still congested but beginning its recovery back to the free-flow mode, the effect of 
flow-density hysteresis can be approximated. 
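
Steps 1 and 4 of the procedure above reduce to short computations, which can be sketched in Python as follows. The function names, the sample data, and the parameter `samples_per_half_hour` (the number of measurement samples in a half hour, which depends on the detector sampling period) are assumptions of this example. The least-squares fit is the closed-form solution for a line through the origin, and faulty or missing data are not handled.

```python
import numpy as np

def fit_free_flow_speed(density, flow):
    """Step 1: least-squares solution v of density * v = flow."""
    density = np.asarray(density, dtype=float)
    flow = np.asarray(flow, dtype=float)
    return float(density @ flow / (density @ density))

def bottleneck_capacity(q2, r2, samples_per_half_hour):
    """Step 4: mean of q2 + r2 over the half hour ending at the peak sample."""
    total = np.asarray(q2, dtype=float) + np.asarray(r2, dtype=float)
    k_max = int(np.argmax(total))
    start = max(0, k_max - samples_per_half_hour + 1)
    return float(total[start:k_max + 1].mean())

# Hypothetical early-morning data: densities (veh/mi) and flows (veh/hr).
v_hat = fit_free_flow_speed([10.0, 20.0, 30.0], [630.0, 1260.0, 1890.0])

# Hypothetical 15-minute mainline and on-ramp flows (veh/hr) at a bottleneck.
Q_hat = bottleneck_capacity([5000, 6000, 7000, 7400, 7000],
                            [500, 500, 600, 600, 500],
                            samples_per_half_hour=2)
print(v_hat, Q_hat)
```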


17.3.3 Results and discussion 


Fig. 17.4 shows contour plots for the measured (top) and simulated CTM (bottom)
densities for a particular day (Wednesday, Nov. 28, 2001) in the I-210 testbed.

Fig. 17.4. Contour plots of 15-minute average measured densities (top) and simulated CTM
densities (bottom), vpmpl, for Nov. 28, 2001.

The numbers inside the shaded cells are traffic densities, in vehicles per mile
per lane (vpmpl). Free-flow densities (0-33 vpmpl) are shown as white. Mid-range
congestion (33-43 vpmpl) is medium gray. Dark gray indicates heavy congestion
(43 vpmpl or greater). Traffic is flowing from left to right in these plots, and the
time, in 15-minute intervals, is given in the leftmost column. The time range is
5:30-10:30AM. Loop detector outages are indicated by crossed-out boxes in the
measured-data contour plot. Loop detectors which were suspected to be faulty for the whole
day have their postmile labels surrounded by a dashed box at the top of the
measured-data plot. If a detector was classified as faulty due to outages in some, but not all,
of the lanes, the corresponding “measured” density displayed in the contour plot is a
scaling-reconstructed estimate. Details regarding the values and consistency of the
estimated parameters obtained with this method can be found in [6, 7].

To evaluate the performance of the simulation, we define a partial Total Travel
Time: $TTT = T_s \sum_{k} \sum_{i \in C_d} l_i\,\rho_i(k)$. Here, $C_d$ is the set of cells which had
problem-free mainline detectors over each of the examined days. $C_d$ excludes detectors
at postmiles 38.209, 38.069, 34.049, 30.779, 30.139, 29.999, 29.879, 28.030,
26.800, and 25.400. Although it functions properly, the detector at 39.159 is also
excluded, since the CTM boundary condition prevents the model from accurately
reproducing congestion that (in the real system) spills upstream outside of the
simulated region. Results for TTT are summarized in Table 17.2, along with the spatial
mean of the mean percentage error (MMPE) at the non-excluded detectors, defined
as

$$E_{MMPE} = 100 \times \frac{1}{N_{C_d}} \sum_{i \in C_d} \frac{1}{M} \sum_{k=1}^{M} \left| \frac{\tilde{\rho}_i(k) - \rho_i(k)}{\tilde{\rho}_i(k)} \right|,$$

where $\tilde{\rho}_i(k)$ is the measured density at detector $i$, $M$ is the number of time samples,
and $N_{C_d}$ is the number of non-excluded detectors (11 out of 22 in this case). The
resulting values of MMPE are not surprising, since they are similar to the MPEs in the
short-segment tests of [5].


Table 17.2. Total Travel Time (veh-hr) and mean MPE for three different days.

Date          | Meas. | Sim. | % TTT Err. | MMPE
--------------|-------|------|------------|-----
Jan. 10, 2002 | 3778  | 3766 | -0.32      | 14.2
Nov. 28, 2001 | 4273  | 4278 | 0.01       | 14.2
Nov. 13, 2001 | 4163  | 3961 | -4.85      | 16.1
mean          | 4071  | 4002 | -1.72      | 14.8
std. dev.     | 260   | 258  | 2.71       | 1.1

From Table 17.2, it can be seen that the CTM reproduces the observed bottlenecks 
and the approximate duration and spatial extent of the congestion upstream of each 
bottleneck, and predicts the total travel time with approximately 2% mean error over 
three days. Simulation tests documented in [7] indicate that TTT is more sensitive to
$Q_M$ than other model parameters.
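
The partial Total Travel Time and the percentage errors reported in Table 17.2 can be computed as sketched below. This is a minimal illustration that assumes densities are stored as a time-by-cell array in vehicles per mile (summed over lanes, not per lane), cell lengths in miles, and the time step in hours. The function names and the small density array are hypothetical; only the two measured and simulated TTT pairs are taken from Table 17.2.

```python
import numpy as np

def partial_ttt(rho, lengths, Ts_hr, cells):
    """TTT = Ts * sum over time steps k and cells i in `cells` of l_i * rho_i(k).

    rho     : array of shape (K, N), densities in veh/mi
    lengths : array of shape (N,), cell lengths in mi
    cells   : indices of the cells with problem-free detectors
    Returns the partial total travel time in veh-hr.
    """
    rho = np.asarray(rho, dtype=float)
    lengths = np.asarray(lengths, dtype=float)
    cells = list(cells)
    return float(Ts_hr * (rho[:, cells] * lengths[cells]).sum())

def pct_error(simulated, measured):
    """Signed percentage error of the simulated value relative to the measured one."""
    return 100.0 * (simulated - measured) / measured

# Hypothetical two-step, three-cell example with 15-minute steps.
ttt = partial_ttt([[100, 200, 50], [100, 200, 50]], [0.5, 0.25, 1.0], 0.25, [0, 2])
print(ttt)

# Measured and simulated TTT for Jan. 10, 2002 and Nov. 13, 2001 (Table 17.2).
print(round(pct_error(3766, 3778), 2), round(pct_error(3961, 4163), 2))
```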


17.4 Traffic State Estimation 


In order to effectively control the on-ramp flows, traffic state information, such 
as vehicle density and the presence or absence of nearby congestion, has to be made 
available to the ramp metering controller. However, cost and other limitations prevent
sensors from being installed and maintained at all desired locations. Therefore,
these traffic states must be estimated using the available data. In this section
we describe a traffic state estimator based on the switching-mode model (SMM), de- 
scribed in Section 17.2.2, and the mixture Kalman filter (MKF) [10], that is capable 
of estimating the vehicle densities at unmeasured locations, as well as determining 
the traffic congestion mode in a freeway section. 


17.4.1 Improved mixture Kalman filter 


In the switching-mode model (SMM) described in Section 17.2.2, there were 
five possible congestion modes. In the remaining sections of this chapter, we further 
simplify this model by considering only two of the five modes, i.e., purely free-flow 
and fully congested, that are most important, while neglecting all other mixed cases, 
such as the mode wherein half the cells in a section are in free-flow and half are 
in congestion. This selection of modes is motivated by the observation that short 
freeway sections tend to spend most of their time in either a free-flow or congested 
condition, with mixed modes being transient. 

The mode (free-flow or congested) is determined by the flow condition in the 
section. However, there is no direct measurement or observation of the current traffic 
congestion mode in a freeway section. The congestion mode can only be inferred 
from measured quantities, such as traffic speed. The general practice in traffic engineering
is to set an upper threshold and a lower threshold for the speed. When the
speed in a section is above the upper threshold, the section is considered to be in free
flow; when the speed is below the lower threshold, the section is in congestion; when 
the speed is between the two thresholds, the section is considered somewhat likely 
to be in congestion. The problem with this kind of method is two-fold: 1) The se- 
lection of the thresholds is based on experience and, to a certain degree, is arbitrary, 
and 2) When the speed falls between the two thresholds, the mode of the section 
cannot be determined. 

Therefore, we assume that we do not have direct observation of the mode and 
that the mode jumps between possible values following a discrete-time Markov chain 
with a certain transition probability. Under these assumptions, the switching-mode 
model falls into a special class, called Markov jump linear systems (MJLS). If only 
the FF and CC modes of Section 17.2.2 are considered, the previous four-cell example
has a continuous state $x = [\rho_1\ \rho_2\ \rho_3\ \rho_4]^T$ and possible discrete modes 1
(free-flow) and 2 (congested).

It is known that it is difficult to estimate the states and the mode when the mode 
itself is not observed. The difficulty lies in the fact that the sample space $S^t$ of the
mode sequence grows exponentially as time $t$ increases, where $S$ is the set of possible
discrete modes. 

The mixture Kalman filter (MKF) [10] approximately solves this difficult probability 
inference problem by employing a Monte Carlo approach that approximates 
the exponentially growing sample space by a fixed finite number, $M$, of mode sample 
sequences $\mathbf{s}_t^{(m)}$, where $m = 1, \ldots, M$. A weight is associated with each of the 
sample sequences to represent the a posteriori probability of that sample sequence, 

$$\xi_t^{(m)} = \frac{P(\mathbf{s}_t^{(m)} \mid \mathbf{y}_t, \mathbf{u}_t)}{\sum_{j=1}^{M} P(\mathbf{s}_t^{(j)} \mid \mathbf{y}_t, \mathbf{u}_t)}, \tag{17.9}$$

where a symbol in boldface, for example, $\mathbf{s}_t$, represents a sequence from time 0 to 
time $t$, and $y_t$ and $u_t$ respectively denote the output measurement and control input. 
After the new measurement $y_{t+1}$ is available, these weights are updated by 

$$\xi_{t+1}^{(m)} = \frac{\xi_t^{(m)} \mu_{t+1}^{(m)}}{\sum_{j=1}^{M} \xi_t^{(j)} \mu_{t+1}^{(j)}}, \tag{17.10}$$

where the incremental weight 

$$\mu_{t+1}^{(m)} = p(y_{t+1} \mid \mathbf{y}_t, \mathbf{u}_t, \mathbf{s}_t^{(m)}), \tag{17.11}$$


represents the likelihood of the new measurement for a given mode sample sequence. 
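
The recursive update (17.10) can be sketched in a few lines. This is a minimal illustration, assuming the incremental weights of (17.11) have already been computed by the per-sequence Kalman filters; the function name and the error handling are assumptions.

```python
def update_weights(weights, incremental):
    """Weight update of (17.10): scale each weight by the likelihood of the
    new measurement on its sample sequence, then normalize to sum to one."""
    unnormalized = [w * u for w, u in zip(weights, incremental)]
    total = sum(unnormalized)
    if total <= 0.0:
        raise ValueError("every sample sequence has zero likelihood")
    return [w / total for w in unnormalized]


# Two sequences with equal prior weight; the second explains the new
# measurement three times better than the first.
print(update_weights([0.5, 0.5], [0.2, 0.6]))
```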

On each of these mode sample sequences, a (time-varying) Kalman filter is im- 
plemented to estimate the continuous states. The state estimates on all mode sample 
sequences are then “mixed” (averaged) by the weights, and this weighted average 
approximates the a posteriori estimate of the continuous states, i.e., 

$$\hat{x}_{t|t} = \sum_{m=1}^{M} \xi_t^{(m)} \hat{x}_{t|t}^{(m)}, \tag{17.12}$$

where $\hat{x}_{t|t}^{(m)}$ is the a posteriori state estimate from the $m$th Kalman filter. 
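
The weighted average in (17.12) amounts to the following sketch, assuming each Kalman filter returns its state estimate as a list of cell densities. The function name and the numbers are hypothetical.

```python
def mixture_estimate(weights, estimates):
    """Weighted average of the per-sequence state estimates, as in (17.12)."""
    n = len(estimates[0])
    return [sum(w * x[i] for w, x in zip(weights, estimates)) for i in range(n)]


# Two sample sequences, two cells; densities are illustrative values.
print(mixture_estimate([0.25, 0.75], [[20.0, 40.0], [40.0, 80.0]]))
```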


The accuracy of this Monte Carlo method is improved by a predictive sampling 
technique, in which the current mode is sampled according to a predictive probability 


$$\mu_{t+1,s}^{(m)} \propto p(s_{t+1} = s,\ y_{t+1} \mid \mathbf{s}_t^{(m)}, \mathbf{y}_t, \mathbf{u}_{t+1}), \tag{17.13}$$


which favors the mode with higher likelihood given the current measurements and 
the previously sampled modes. 
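
A minimal sketch of predictive sampling for one sample sequence follows. It assumes the joint probability in (17.13) factors into the Markov transition probability from the previously sampled mode and the likelihood of the new measurement under each candidate mode (in practice supplied by the Kalman filter prediction). The dictionaries, names, and numbers are hypothetical.

```python
import random


def predictive_probabilities(prev_mode, transition, likelihood):
    """Normalized probability of each candidate mode, proportional to
    P(mode | prev_mode) * p(new measurement | mode)."""
    joint = {s: transition[prev_mode][s] * likelihood[s] for s in likelihood}
    total = sum(joint.values())
    return {s: p / total for s, p in joint.items()}


def sample_mode(prev_mode, transition, likelihood, rng=random):
    """Draw the next mode according to the predictive probabilities."""
    probs = predictive_probabilities(prev_mode, transition, likelihood)
    modes = list(probs)
    return rng.choices(modes, weights=[probs[s] for s in modes], k=1)[0]


# Modes: 1 = free-flow, 2 = congested (illustrative numbers).
transition = {1: {1: 0.9, 2: 0.1}, 2: {1: 0.2, 2: 0.8}}
likelihood = {1: 0.1, 2: 0.9}
print(predictive_probabilities(1, transition, likelihood))
print(sample_mode(1, transition, likelihood))
```

Here a strong likelihood for the congested mode offsets the low prior transition probability, so both modes are sampled with roughly equal probability.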

The weight update procedure is recursive. The entire history from time 0 in- 
fluences the current weight. It is often found in implementation that most of these 
weights approach 0 while only a few remain of modest magnitudes. This phe- 
nomenon reduces the effective number of sample sequences and introduces an under- 
flow risk for the weights when implemented on a machine with finite floating point 
precision. Therefore, a forgetting and weight underflow prevention scheme [8] has 
been introduced in our implementation. In this scheme, the weights of the sample 
sequences are bounded from below, 


$$\xi_t^{(m)} = \max\{\xi_t^{(m)},\ \xi_{\min}\}, \tag{17.14}$$


and are re-normalized after the bounding step. This simple procedure not only pre- 
vents the underflow, but in effect limits the influence of the early history on the 
weights and makes the weights recover more quickly when their corresponding sam- 
ple sequences are favored by the current measurements. 
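
The bounding and re-normalization step can be sketched as follows. The floor value is a hypothetical tuning parameter; the text does not state the value used in the implementation.

```python
def bound_and_renormalize(weights, floor=1e-6):
    """Bound each weight from below, then re-normalize so the weights sum to one."""
    bounded = [max(w, floor) for w in weights]
    total = sum(bounded)
    return [w / total for w in bounded]


# A weight that has collapsed to zero is lifted back to a small positive value.
print(bound_and_renormalize([1.0, 0.0], floor=0.25))
```

Because no weight can fall below the floor, a sample sequence that was unlikely in the past needs only a few favorable measurements to regain a significant weight.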

In addition to the mixture estimate of the continuous states, the mixture Kalman 
filter also provides an approximate maximum a posteriori (MAP) estimation of the 
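congestion mode: 

$$\hat{s}_t^{\mathrm{MAP}} = \arg\max_{s \in S}\ p(s_t = s \mid \mathbf{y}_t, \mathbf{u}_t), \tag{17.15}$$

where 

$$p(s_t = s \mid \mathbf{y}_t, \mathbf{u}_t) \approx \sum_{m=1}^{M} \xi_t^{(m)} \mathbf{1}_s(s_t^{(m)}), \tag{17.16}$$

and $\mathbf{1}_s(\cdot)$ is the indicator function. This is particularly important to our application 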
because different control schemes will be used in different congestion modes (free- 
flow or congested), based on the freeway density controllability properties [5]. 
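
The MAP mode estimate of (17.15) and (17.16) reduces to summing the weights of the sample sequences that currently sit in each mode. A minimal sketch with hypothetical names and numbers:

```python
def map_mode(weights, current_modes):
    """Approximate mode posterior by summing weights per mode, then pick
    the mode with the largest total weight."""
    posterior = {}
    for w, s in zip(weights, current_modes):
        posterior[s] = posterior.get(s, 0.0) + w
    best = max(posterior, key=posterior.get)
    return best, posterior


# Three sample sequences; modes: 1 = free-flow, 2 = congested.
mode, posterior = map_mode([0.3, 0.3, 0.4], [1, 2, 1])
print(mode, posterior)
```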


17.4.2 Results and discussion 


The mixture Kalman filter based congestion mode and vehicle density estimation 
algorithm was first tested on a two-mile long section of I-210W, from Myrtle to Santa 
Anita, as shown in Fig. 17.5(a). The flow and density measurements at the Myrtle and 
Santa Anita stations were used as feedback to the estimator, while the measurements 
at the Huntington station were assumed to be unavailable and were used to evaluate 
the estimation accuracy. Fig. 17.5(b) shows the estimation results using data from 
April 10, 2001, with the number of sample sequences M = 10. The first three plots 
show the vehicle density estimation results. In these plots, the solid lines are the 
measured vehicle densities, while the dashed lines are the mixture estimates. The last 
plot shows the MAP congestion mode estimates, where 1 indicates free-flow mode 
and 2 indicates congested mode. It can be seen from the plots that the estimator 
was able to accurately estimate the density at the Huntington station, which was 
not available to the estimator. The mean percentage errors (MPEs), as defined by 


$$E_{\mathrm{MPE}} = \frac{1}{T} \sum_{t} \ldots$$

[...] experiment. A complete list of estimation errors with different estimator settings 


and using data from different days can be found in [8]. On average, this estimator 
achieves an MPE of ~10%. 
Fig. 17.5. Test results of the congestion mode and vehicle density estimator. 


This estimator was also implemented for the entire 14-mile long test segment [9]. 
The program was written in the C language for reasons of efficiency and portability. 
Not only can the estimator run on collected traffic data sets, but it has also been 
successfully interfaced with a calibrated VISSIM microscopic traffic simulator [11], 
through the VISSIM DDE (Direct Data Exchange) interface, and with a calibrated 
macroscopic cell transmission model [6]. The estimator runs synchronously with 
these traffic simulators. The running time of the estimator with 10 mode sample 
sequences for a 7-hour (from 5AM to 12 noon) time period and the full 14-mile 
segment is less than one minute on a 1.4 GHz Pentium M computer. 

The traffic data was extracted from the PeMS [22] database in 2002 and 2003. 
The traffic flows and vehicle occupancies are available every 30 seconds, while the 
speeds and g-factors (estimated average vehicle lengths) are available every 5 minutes. 
We interpolated the 30-second flow and density data into 2-second intervals and 
passed the interpolated data through a low-pass filter to reduce the amount of noise 
in the original 30-second traffic data. The estimator produced the vehicle density 
for each cell, as well as the congestion mode for each section, using this 2-second 
interpolated and filtered data. 
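
The pre-processing step might look like the sketch below. The text does not specify the interpolation method or the filter, so linear interpolation and a first-order exponential smoother are assumptions here, as are the function names and the smoothing constant.

```python
def interpolate_linear(samples, factor):
    """Linearly interpolate between consecutive samples, producing `factor`
    sub-intervals per original interval (15 for 30 s -> 2 s)."""
    out = []
    for a, b in zip(samples, samples[1:]):
        out.extend(a + (b - a) * k / factor for k in range(factor))
    out.append(samples[-1])
    return out


def low_pass(values, alpha=0.1):
    """First-order low-pass filter: y[n] = y[n-1] + alpha * (x[n] - y[n-1])."""
    filtered = [values[0]]
    for x in values[1:]:
        filtered.append(filtered[-1] + alpha * (x - filtered[-1]))
    return filtered


# Three 30-second density samples (hypothetical values) become 31 samples at 2 s.
fine = low_pass(interpolate_linear([30.0, 60.0, 45.0], 15))
print(len(fine), fine[:3])
```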

Fig. 17.6(b) shows an example of the MAP congestion mode estimation from the 
estimator. In this example, data from January 10, 2002 was used. In the plot, light 
gray indicates free-flow mode, while dark gray indicates congested mode. 

For comparison, a contour plot of the PeMS-derived 15-minute average speeds 
for that day is given in Fig. 17.6(a). In this plot, light gray indicates an average 
speed of 55 miles per hour and above, which is generally considered to indicate 
free-flow conditions in traffic engineering. Dark gray indicates an average speed of 
40 miles per hour and below, in which case the traffic is considered to be in conges- 
tion. Medium gray indicates an average speed between 40 and 55 miles per hour; 
in this range, the traffic is somewhat likely to be in congestion. In the plot, white 
indicates unavailable data. 

It can be seen from the plot that in general, the congestion mode estimation by the 
MKF-based estimator agrees with the speed contour plot. However, the MKF-based 
congestion mode estimation is preferable for the following reasons. 


1. As mentioned earlier, the thresholds for the speed are determined empirically 
and can vary from location to location. 

2. It is not clear whether a section is in congestion or not when the speed is between 
the upper and lower thresholds. 

3. The speed data usually are not available as frequently as the density and flow 
data when the data are collected using single loop detectors, which is usually the 
case. 

4. More importantly, the MKF-based estimation provides the statistically most 
probable mode that directly corresponds to one of the possible dynamic models, 
while the speed-based estimate itself does not have this direct correspondence. 

5. The MKF-based estimator also provides vehicle density estimation for all the 
cells where no measurements are available. 


17.5 Ramp-Metering Control 


The goal of an on-ramp control system is to improve the efficiency of a free- 
way by regulating the number of vehicles that are allowed to enter through the on- 
ramps, in order to delay the onset of congestion and minimize its duration, and con- 
sequently maximize off-ramp exit flows, while preventing the on-ramp queues from 
spilling over and producing congestion in arterial routes. In this section, we briefly 
review a localized and traffic responsive ramp-metering strategy that we have de- 
veloped [13, 14], which chooses the less restrictive ramp metering rate among the 
following two control systems: a control system that monitors and regulates the free- 
way mainline traffic density in the neighborhood of a ramp merge, and a control 
system that estimates and regulates the on-ramp queue length, preventing it from 
spilling over into the arterial traffic. This structure was first proposed by Smaragdis 
and Papageorgiou [16]. 

(b) MKF maximum a posteriori estimation (dark gray: congested; light gray: free-flow). 

Fig. 17.6. Congestion mode estimation for the test segment of Interstate 210 Westbound in 
Pasadena, California (January 10, 2002). 


17.5.1 Switching ramp-metering mainline density control 


As discussed in Sections 17.2.2 and 17.4.1, the traffic dynamics in a freeway 
section are different under different congestion conditions—free-flow or congested. 
Under free-flow conditions, unmeasured mainline densities in a section can only 
be estimated using a downstream measurement, while on-ramp flows affect down- 
stream mainline traffic densities. On the other hand, under congested conditions, 
unmeasured mainline densities in a section can only be estimated using an upstream 
measurement, while on-ramp flows affect upstream mainline traffic densities. It is 
therefore advantageous to change the structure of the localized controller to suit the 
current controllability and observability properties of the mainline segment near the 
on-ramp merge, as illustrated in Fig. 17.7. Moreover, since most California freeways 
only have mainline loop detectors located upstream of the on-ramp, as depicted in 
Fig. 17.7, mainline densities in sections that are downstream of the on-ramp, such as 
the mainline density $\rho_2$ in Fig. 17.7(a), must be estimated. The mixture Kalman filter 
(MKF) based traffic state estimator that we have developed [8, 9] is used to estimate, 
in real time, the most probable congestion mode and the cell vehicle densities in a 
freeway section. The estimated congestion mode is used to determine the appropriate 
control structure, and the estimated vehicle densities are used as feedback. 

(a) Free-flow mode. (b) Congested mode. 

Fig. 17.7. Different control structures for different congestion modes. 

To compensate for disturbances and to accommodate the difference between 
the model sampling time and the metering-rate update interval, a multirate linear 
quadratic control with integral action (multirate LQI) approach [13] was used to 
synthesize the ramp-metering controller for both of the congestion modes. 
In either mode, the desired metering rate is first calculated using 

$$r_c(t) = r(t-1) - K(t) \begin{bmatrix} z(t) \\ \Delta\rho(t) \end{bmatrix}, \tag{17.17}$$

and then saturated using 

$$r_c(t) = \min\{ r_{\max},\ \max\{ r_{\min},\ r_c(t) \} \}, \tag{17.18}$$

where $r(t)$ is the actual ramp flow measured by the entrance loop-detector, as shown 
in Fig. 17.8, $\Delta\rho(t)$ is the mainline density error ($\rho - \bar{\rho}$ for a desired density $\bar{\rho}(t)$), 
$z(t) = \Delta\rho(t) - \Delta\rho(t-1)$, and $r_{\max}$ and $r_{\min}$ are the established maximum and minimum 
metering rates. There is an anti-windup scheme in (17.17) that is similar to what is 
used in ALINEA [12] to address the metering-rate saturation problem. 
In (17.17), 

$$K(t) = \begin{cases} K_p, & \text{when } t = np \text{ for some } n \in \mathbb{Z}, \\ 0, & \text{when } t \neq np \text{ for any } n \in \mathbb{Z}, \end{cases} \tag{17.19}$$

where $p$ is the ratio between the metering-rate update interval and the model sampling 
time, and $K_p$ is determined by solving a periodic Riccati equation. See [13] 
for details. 
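
The multirate gain schedule of (17.19) and the saturation of (17.18) can be sketched as below. The gain is treated as a scalar placeholder (in the controller it multiplies a vector of feedback signals), and the numerical rate limits are hypothetical.

```python
def multirate_gain(t, p, gain):
    """Gain schedule of (17.19): nonzero only at metering-rate update
    instants, i.e. when the model time step t is a multiple of p."""
    return gain if t % p == 0 else 0.0


def saturate(rate, r_min, r_max):
    """Saturation of (17.18): clip the desired rate to [r_min, r_max]."""
    return min(r_max, max(r_min, rate))


# With p = 3 the rate may change only at every third model time step.
print([multirate_gain(t, 3, 2.5) for t in range(7)])
# Hypothetical limits in vehicles per hour.
print(saturate(1200.0, 180.0, 900.0), saturate(100.0, 180.0, 900.0))
```

Between update instants the gain is zero, so the commanded rate simply tracks the measured ramp flow from the previous step.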


17.5.2 Queue length regulation 


A typical configuration of loop detectors and signals on an on-ramp on a Cal- 
ifornia freeway is shown in Fig. 17.8. To prevent the on-ramp queue from spilling 
over into surface streets and interfering with the street traffic, the queue length must 
be regulated. The “queue-override” scheme currently used on California freeways 
steadily increases the metering rate (e.g., 120 vehicles per hour per lane every 30 sec- 
onds) whenever the end of the queue reaches the queue detector, until the metering 
rate saturates to the maximum value. After the queue dissipates and recedes behind 
the queue detector location, the metering rate is reset to the value determined by the 
mainline traffic responsive metering controller. This scheme is equivalent to an inte- 
gral control with a saturated integrating rate and resetting. It can be easily shown that 
the resulting closed loop dynamics is not asymptotically stable, given that the open 
loop queue length dynamics is that of a simple integrator. It has been noted [15, 16] 
that this queue-override scheme leads to oscillatory behavior and under-utilization 
of on-ramp storage capacities. In [16], Smaragdis and Papageorgiou proposed a pro- 
portional controller that relies on the on-ramp vehicle demands. However, real-time 
demand measurements are generally not available in the field, and such a control 
scheme would have to instead rely on historical demands. 

If the queue length $l(t)$ could be measured, an asymptotically stable PI-controller 

$$r_c(z) = \left( k_P + \frac{k_I z}{z - 1} \right) \tilde{l}(z) \tag{17.20}$$

would be able to regulate the queue length precisely at a specified value. This controller 
can be designed by choosing proper gains $k_P$ and $k_I$, using the root-locus 
method on the closed-loop sensitivity function from the disturbance to the error, 
which is given by 

$$\frac{\tilde{l}(z)}{d(z)} = \frac{T(z-1)}{(z-1)^2 + k_P T (z-1) + k_I T z}, \tag{17.21}$$

where $\tilde{l}(t)$ is the queue length error, and $d(t)$ is the vehicle arrival rate (the demand), 
which is regarded as a disturbance. 

Fig. 17.8. A typical configuration of loop detectors and signals on an on-ramp. 

The anti-windup and saturating mechanisms in (17.17) and (17.18) also need to 
be implemented in this queue length regulator. 
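
The behavior of a PI queue regulator on the simple-integrator queue model can be explored with a short simulation. This is a sketch under stated assumptions, not the implementation used in the tests: the queue is assumed to follow l(t+1) = l(t) + T(d - r(t)), the integral term is assumed to include the current error, the update interval T is assumed to be 30 seconds, and saturation and anti-windup are deliberately omitted. The gains of 120 are the values reported in Section 17.5.6.

```python
def simulate_pi_queue(l0, l_ref, demand, steps,
                      k_p=120.0, k_i=120.0, T=30.0 / 3600.0):
    """Simulate a PI queue regulator on an integrator queue model.

    l0, l_ref : initial and desired queue length (vehicles)
    demand    : constant arrival rate d (vehicles per hour)
    T         : update interval in hours (30 s is an assumption)
    """
    queue, integral = l0, 0.0
    history = [queue]
    for _ in range(steps):
        error = queue - l_ref                   # queue length error
        integral += error                       # includes the current error
        rate = k_p * error + k_i * integral     # metering rate, veh/h
        queue += T * (demand - rate)            # integrator queue dynamics
        history.append(queue)
    return history


# Queue starts 10 vehicles above a set-point of 20, demand 600 veh/h.
print(simulate_pi_queue(30.0, 20.0, 600.0, 5))
```

With these assumed values the queue returns to the set-point after two update intervals and stays there, with the metering rate settling at the demand. Different values of T or of the gains change the closed-loop poles, so this settling behavior should not be read as a property of the field implementation.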


17.5.3 On-ramp queue length estimation 


Though it has a more stable response than the queue-override scheme, the PI 
regulator described in Section 17.5.2 needs the current queue length as its feedback, 
which unfortunately is not presently available in the field. A suitable estimator has to 
be designed using available information, such as the speed of the vehicles entering 
the on-ramp, as measured by the queue detector [14]. 

We assume the following simplified driving behavior model for a vehicle approaching 
the end of the queue: the vehicle decelerates at a constant rate, $-a$, from 
its cruising speed to a target speed $v_0$, which is achieved at the position where the 
distance from the end of the queue is $s$. We also assume a uniform effective vehicle 
length $g$. Let $l_0$ be the number of vehicle spaces from the stop line to the queue detector 
and $v(t)$ be the vehicle speed measured by the queue detector, as depicted in 
Fig. 17.9. 

A straightforward kinematic calculation yields 

$$g(l_0 - l(t)) - s = \frac{v(t)^2 - v_0^2}{2a}, \tag{17.22}$$

where $l(t)$ is the current queue length, in number of vehicles. From (17.22), we 
obtain 

$$g\,l(t) = g\,l_0 - s + \frac{v_0^2 - v(t)^2}{2a} = c_0 - c_2 v^2(t). \tag{17.23}$$

Fig. 17.9. A schematic for on-ramp queue length estimation. 

Fig. 17.10. A scatter plot of queue lengths vs. queue detector speeds and the least 
median-of-squares curve fit for one of the on-ramps. 


To determine the coefficients $c_0$ and $c_2$ in (17.23), a curve fitting was performed 
on the $g\,l(t)$ and $v(t)$ data collected using the VISSIM [23] microscopic traffic simulator. 
Fig. 17.10 shows a typical scatter plot of queue lengths versus speeds. A few 
points need to be noted: 


1. When the queue is shorter than a certain length, the approaching vehicles pass 
the queue detector at the drivers’ desired cruising speeds, which are indepen- 
dent of the queue length. This phenomenon corresponds to the data points at the 
lower-right corner of the scatter plot. 

2. When the queue is longer than $g\,l_0$, i.e., the queue has extended beyond the queue 
detector, the measured speed is also a constant, which is related to the queue 
discharging rate and the vehicle lengths, and is also independent of the queue 
length. This phenomenon corresponds to the data points at the upper-left corner 
of the scatter plot. 

3. There are many outliers among the data points. Therefore, the usual least-squares 
curve fitting method, which is biased toward outliers, is not suitable. 


For these reasons, we neglected the data points whose speeds are below $v_{\min}$ or 
above $v_{\max}$ and those whose queue lengths are below $l_{\min}$ or above $l_{\max}$ in the curve 
fitting. These values were determined by visual inspection of the scatter plots. 

To increase robustness to outliers, we used the least median-of-squares [24] curve 
fitting method, instead of the usual least (sum-of-)squares. The fitted curve is also 
shown in Fig. 17.10. 
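
A least median-of-squares fit can be approximated by testing the curve through every pair of data points and keeping the one with the smallest median squared residual. The sketch below applies this to the model of (17.23), which is linear in the squared speed. It is an illustrative approximation suitable for small data sets, not the fitting code used for Fig. 17.10; the function name and the data are hypothetical.

```python
from itertools import combinations
from statistics import median


def lmeds_fit(v, gl):
    """Fit gl = c0 - c2 * v**2 by minimizing the median squared residual
    over the candidate curves through every pair of data points."""
    x = [vi * vi for vi in v]
    best = None
    for i, j in combinations(range(len(x)), 2):
        if x[i] == x[j]:
            continue  # a pair with equal speeds does not define a curve
        slope = (gl[j] - gl[i]) / (x[j] - x[i])
        intercept = gl[i] - slope * x[i]
        med = median((yi - (intercept + slope * xi)) ** 2
                     for xi, yi in zip(x, gl))
        if best is None or med < best[0]:
            best = (med, intercept, -slope)
    if best is None:
        raise ValueError("need at least two points with distinct speeds")
    return best[1], best[2]  # (c0, c2)


# Five points on gl = 100 - 0.5 * v**2 plus two gross outliers.
speeds = [2.0, 4.0, 6.0, 8.0, 10.0, 3.0, 5.0]
lengths = [98.0, 92.0, 82.0, 68.0, 50.0, 300.0, -50.0]
print(lmeds_fit(speeds, lengths))
```

Because the criterion is the median residual rather than the sum, the two outliers do not pull the fitted curve away from the majority of the points. For large data sets, random sampling of point pairs would be a cheaper alternative to the exhaustive search.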

After the $l$-$v$ curve is fitted for each on-ramp, the difference between the actual 
and desired queue length, which is used as the feedback to the regulator (17.20), is 
estimated as 

$$\tilde{l}(t) = \begin{cases} \ldots, & \text{if } v(t) \ge v_{\min}, \\ -k c_2 \left( v(t)^2 - v_{\min}^2 \right) / g, & \text{if } v(t) < v_{\min}, \end{cases} \tag{17.24}$$

where $k$ is a tuning parameter. 

When $v(t) < v_{\min}$, the end of the queue is very close to or beyond the queue 
detector, and the speed $v(t)$ measured by the queue detector is a constant, which is 
roughly $g r_c$. Therefore, (17.24) can be thought of as a method for saturating $\tilde{l}$ to 
$-k c_2 ((g r_c)^2 - v_{\min}^2)/g$, which is larger when the metering rate $r_c$ is lower. This 
has a desirable effect on the regulator: the metering rate $r_c$ will be increased more 
aggressively when there is more room for this increase, and more slowly when $r_c$ is 
close to its maximum value. In addition, this saturation value can be further tuned by 
changing the value of $k$. 

It is also worth mentioning that the coefficients $c_0$ and $c_2$ identified by the least 
median-of-squares fitting are very close to the nominal values predicted by using the 
actual distance between the stop-line and the queue detector and a nominal vehicle 
deceleration of 2.5 m/s². Therefore, when no queue length measurements are 
available for curve fitting, these nominal values can be 
used in the queue length estimation. 
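
The nominal coefficients follow directly from the kinematic relation (17.22): c2 = 1/(2a) and c0 = g*l0 - s + v0^2/(2a). The sketch below computes them and inverts (17.23) for the queue length. Only the deceleration of 2.5 m/s² comes from the text; the vehicle length, detector position, gap, and target speed are hypothetical values chosen for illustration.

```python
def nominal_coefficients(g, l0, s, v0, a=2.5):
    """Nominal c0 and c2 of (17.23) from the kinematic model (17.22).

    g  : effective vehicle length (m)
    l0 : vehicle spaces between stop line and queue detector
    s  : distance from the queue end where the target speed is reached (m)
    v0 : target speed (m/s)
    a  : deceleration magnitude (m/s^2); 2.5 is the nominal value in the text
    """
    c2 = 1.0 / (2.0 * a)
    c0 = g * l0 - s + v0 ** 2 * c2
    return c0, c2


def queue_length(v, g, c0, c2):
    """Queue length in vehicles from the detector speed v (m/s)."""
    return (c0 - c2 * v ** 2) / g


c0, c2 = nominal_coefficients(g=7.5, l0=20, s=5.0, v0=2.0)
print(c0, c2, queue_length(8.0, 7.5, c0, c2))
```

The estimate is meaningful only for speeds inside the fitted range; outside it the measured speed no longer depends on the queue length, as noted above.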


17.5.4 A localized and traffic responsive on-ramp metering control strategy 


Localized traffic responsive strategies are desired for reasons including reduced 
algorithmic complexity, lower computational requirements, and higher robustness to 
changing traffic conditions such as unpredicted demands. We have proposed a local- 
ized metering strategy and tested it on a calibrated macroscopic traffic model [13]. It 
is described as follows: 


1. The set-point for the switching mainline traffic responsive ramp-metering con- 
troller is chosen to be the critical density, i.e., the density at which congestion is 
about to form. This is adopted to slow down congestion shock waves propagat- 
ing in the upstream direction and to speed up congestion shock waves moving 
downstream. 

2. The set-point of the queue length regulator is the maximum allowed queue 
length. This value is chosen to fully utilize the available storage capacity on 
the ramp and to deter short-trip travelers from using the freeway, thus making 
the freeway capacity available to longer-distance travelers. 

3. The higher of the two rates determined by the mainline traffic responsive meter- 
ing controller and the queue length regulator is chosen to be the actual metering 
rate that is sent to the signal control box. This rule, which was first proposed by 
Smaragdis and Papageorgiou [16], is designed to properly resolve the conflict 
between the objectives of these two controllers. 


17.5.5 Performance measures 


Before presenting results, we first define some performance measures for quan- 
titative evaluation of a given freeway segment. All the quantities are defined for the 
time period T and the freeway segment L. 


$D_{V,\mathrm{tot}}$ Total Vehicle Distance, which is defined as the sum of the distances traveled 
by all the vehicles in L within T. 

$T_{V,\mathrm{tot}}$ Total Vehicle Time, which is the sum of the time that is spent by all vehicles 
in L within T. It includes the time spent by vehicles waiting in the on-ramp 
queues. 

$DL_{V,\mathrm{tot}}$ Total Vehicle Delay (also known as Congestion Delay), which is the difference 
between the Total Vehicle Time and the time that would be spent by all the 
vehicles if there were no congestion: $DL_{V,\mathrm{tot}} = T_{V,\mathrm{tot}} - D_{V,\mathrm{tot}}/v_0$, where $v_0$ 
is the nominal free-flow speed. 

$v_{V,\mathrm{tot}}$ Average Total Vehicle Speed, $v_{V,\mathrm{tot}} = D_{V,\mathrm{tot}}/T_{V,\mathrm{tot}}$. 

$v_{V,\mathrm{ml}}$ Average Mainline Vehicle Speed, which is similar to $v_{V,\mathrm{tot}}$ but calculated 
using only data from the mainline portion of the freeway. 


Another set of passenger-weighted performance measures can be defined by 
first computing the traffic quantities separately for the low- and high-occupancy 
vehicle classes, and then, during the performance-measure calculation, weighting 
these quantities by the average passenger number in each vehicle class. This 
set of passenger-weighted performance measures includes Total Passenger Distance 
$D_{P,\mathrm{tot}}$, Total Passenger Time $T_{P,\mathrm{tot}}$, Total Passenger Delay $DL_{P,\mathrm{tot}}$, Average 
Total Passenger Speed $v_{P,\mathrm{tot}}$, and Average Mainline Passenger Speed $v_{P,\mathrm{ml}}$. 


17.5.6 Results 


The switching mainline traffic responsive metering controller and the queue 
length regulator were implemented and interfaced with the VISSIM microscopic traf- 
fic simulation model that has been calibrated to the I-210W test segment in [11]. The 
localized control strategy described in Section 17.5.4 was used. Fig. 17.11 shows the 
congestion patterns, as determined by the MKF traffic state estimator [8, 9], before 
and after ramp metering. In the plots, dark gray indicates congested mode and light 
gray free-flow. The vertical axis is time, from 5:30 to 11:00 in the morning. The 
horizontal axis is the mile post along the freeway, and the traffic travels from left to 
right. It can be seen that the localized ramp metering strategy was able to reduce the 
congestion, in terms of both the spatial span and the time duration. 

(a) Without ramp metering. (b) With ramp metering. 

Fig. 17.11. Congestion modes for the I-210W test segment under different metering scenarios, 
light gray: free-flow, and dark gray: congested. 

For these tests, the parameters in the multirate LQI design were set as follows: 
$Q = I$, $R = 5$ for the free-flow mode, and $Q = I$, $R = 20$ for the congested mode, 
with gains $k_P = k_I = 120$ in the queue length regulator. 

We also implemented a modified version of ALINEA [12] and combined it with 
the queue length estimator and regulator that we have presented in Sections 17.5.3 
and 17.5.2. In this modified ALINEA, we used the occupancy data measured up- 
stream of the on-ramps, instead of using occupancies measured downstream of the 
on-ramps (as originally recommended for ALINEA), in order to better reflect the 
loop-detector configuration typically available on California freeways. In [25] it 
has been shown, using the calibrated I-210W VISSIM model, that this modified 
ALINEA can achieve comparable and sometimes even better performance, when 
compared to the original ALINEA. The optimal ALINEA gain (7000) and set-point 
(27.2%) found in [25] were used in our simulations. 

Different ramp-metering algorithms, including 1) switching LQI plus queue reg- 
ulation, 2) switching LQI only, without queue regulation, and 3) ALINEA plus queue 
regulation, were tested with the I-210W VISSIM model. Under each scenario, 8 simulation 
runs were carried out, with 8 different VISSIM random seeds. Each random 
seed was set to the seconds value of the computer clock at the time the seed was 
changed, to ensure its randomness. 

Some of the performance measures for this freeway segment, as defined in Sec- 
tion 17.5.5, are listed in Table 17.3. The listed numbers are the averages from the 
8 simulation runs for each scenario. In calculating these quantities, the average numbers 
of passengers per low- and high-occupancy vehicle are assumed to be 1.2 and 
2.5, respectively, and the nominal free-flow speed $v_0$ is 63 miles per hour. 

It should be noted that the I-605 interchange into I-210 cannot be metered and 
provides a large volume of inflow traffic into the upstream portion of I-210W (on- 
ramp number 6 in Fig. 17.3). Consequently, congestion takes place on I-210W, even 
when all metered on-ramp queues are allowed to overflow into arterial streets. 

Under all the scenarios, the freeway segment served almost the same amount of 
demand, as measured by the Total Vehicle Distance $D_{V,\mathrm{tot}}$ or Total Passenger Distance 
$D_{P,\mathrm{tot}}$. Ramp-metering was able to reduce the congestion under all the metered 
scenarios. For example, with the switching LQI mainline control and queue length 
regulation, the Total Vehicle Delay (also known as Congestion Delay) $DL_{V,\mathrm{tot}}$ was 
reduced by 16%, while with the switching LQI mainline control only, $DL_{V,\mathrm{tot}}$ was 
reduced by 20%. 
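
As a worked check, the delay definition of Section 17.5.5 and the quoted reductions can be recomputed from the rounded entries of Table 17.3 (distance 973 thousand miles, time 24.0 thousand hours, free-flow speed 63 mph). The function names are illustrative.

```python
def total_delay(total_time_h, total_distance_mi, v0_mph=63.0):
    """Total delay: total time minus the time needed at free-flow speed."""
    return total_time_h - total_distance_mi / v0_mph


def relative_reduction(baseline, value):
    """Fractional reduction of `value` relative to `baseline`."""
    return (baseline - value) / baseline


# No-metering scenario, from the rounded table entries.
print(round(total_delay(24.0e3, 973e3)))            # hours
# Delay reductions (delays in thousands of hours).
print(round(100 * relative_reduction(8.52, 7.19)))  # switching + Q/R, percent
print(round(100 * relative_reduction(8.52, 6.81)))  # switching only, percent
```

The recomputed delay of about 8.56 thousand hours differs slightly from the tabulated 8.52, presumably because the table inputs are rounded; the 16% and 20% reductions are reproduced.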

When only the switching mainline traffic responsive metering was used, with- 
out activating the queue length regulator, on-ramp queues could grow to arbitrary 
lengths, sometimes hundreds of vehicles. In this case, almost all the congestion on 
the mainline was eliminated, as evidenced by the average mainline vehicle speed 
$v_{V,\mathrm{ml}}$, which was 55.8 mph. Another interesting phenomenon in this case is that the 
relative improvements in terms of passenger-weighted performance measures were 
greater than those in terms of vehicle performance measures. This is because many 
of the metered on-ramps on this freeway segment have designated lanes for HOVs to 
bypass the long queues. 

Table 17.3. Performance measures for the I-210 test segment under different ramp-metering 
algorithms. Q/R means queue estimation and regulation. 

                     D_V,tot       T_V,tot       DL_V,tot      v_V,tot   v_V,ml
                     (10^3 mile)   (10^3 hour)   (10^3 hour)   (mph)     (mph)

No metering          973           24.0          8.52          40.6      37.8
Switching + Q/R      972           22.6          7.19          43.0      40.8
  Improvement        -             5.6%          16%           5.8%      7.9%
Switching            974           22.3          6.81          43.7      55.8
  Improvement        -             7.1%          20%           7.7%      47%
ALINEA + Q/R         974           23.5          8.01          41.5      39.2
  Improvement        -             2.0%          5.8%          2.2%      3.6%

                     D_P,tot       T_P,tot       DL_P,tot      v_P,tot   v_P,ml
                     (10^6 mile)   (10^3 hour)   (10^3 hour)   (mph)     (mph)

No metering          1.32          31.7          10.8          41.6      39.4
Switching + Q/R      1.32          29.9          9.0           44.0      42.4
  Improvement        -             5.5%          16%           5.7%      7.4%
Switching            1.32          29.3          8.3           45.1      56.4
  Improvement        -             7.7%          23%           8.4%      43%
ALINEA + Q/R         1.32          31.0          10.1          42.5      40.7
  Improvement        -             2.1%          6.3%          2.3%      3.4%

It can also be seen from the numbers in Table 17.3 that the switching control 
algorithm outperforms the modified ALINEA, when both algorithms are combined 
with the queue length estimator and regulator. 


17.6 Conclusions 


In this chapter, we first presented a macroscopic freeway traffic model, the 
Switching-Mode Model (SMM), which is a piecewise-linearized version of Da- 
ganzo’s CTM [3, 4]. The SMM is computationally efficient and well-suited for im- 
plementation in real-time control, estimation, and traffic monitoring applications. 
The observability and controllability properties of the individual modes of the SMM, 
which are of fundamental importance in the design of data estimators and ramp- 
metering control systems, were stated. It was revealed that the free-flow traffic mode 
is observable from a downstream measurement and controllable from an upstream 
on-ramp, and that the congested mode is observable from an upstream measurement 
and controllable from a downstream on-ramp. 

A procedure for calibrating the CTM and SMM parameters was summarized. A 
calibrated CTM model was tested on the full 14-mile test section of I-210W, and has 
been shown to reproduce the main features of the observed traffic congestion on the 
freeway, such as approximate location of bottlenecks and duration and spatial extent 
of congestion. In addition, the model accurately predicts the total travel time (TTT) 
in the freeway. A main benefit of the overall calibration method is that it provides 
a well-defined, automatable procedure for using loop detector data to estimate free- 
flow speeds, congestion parameters, and bottleneck capacities for the CTM. 

A congestion mode and vehicle density estimator was designed and implemented 
on the I-210W test segment. Using the mixture Kalman filtering (MKF) algorithm on 
the switching-mode traffic model, the estimator is able to provide the estimated vehi- 
cle densities at unmeasured locations, as well as the most probable traffic congestion 
modes (free-flow or congested), which are not directly observed. The test results on a 
short freeway section show that on average, a mean percentage error of about 10% in 
density estimation is achieved and the performance is consistent over different days. 
The algorithm approximately maintains its performance even with a relatively small 
number of sample sequences and runs efficiently, thus making it possible to carry out 
estimation in real time. The availability of the congestion modes enables us to design 
more effective ramp metering algorithms, utilizing the appropriate switching-mode 
model dynamics under different flow conditions. 

We also presented a localized ramp-metering strategy that achieves the control 
goal of reducing the spatial and temporal extent of the congestion, using locally 
available information. This control strategy works with a switching mainline traf- 
fic responsive ramp-metering controller that adapts to the different traffic dynamics 
under different congestion conditions, and a PI queue length regulator that yields im- 
proved performance over the currently used “queue-override” scheme and keeps the 
queue under the ramp storage capacity limit. In addition, a queue length estimator 
was designed to provide feedback to the queue length regulator, using the queue- 
detector speed data that are available in the field. Test results on the calibrated VIS- 
SIM I-210W microscopic model demonstrated the performance and effectiveness of 
the switching ramp-metering controller, the queue length estimator and regulator, 
and the overall control strategy. The Total Vehicle and Passenger Delays were both 
reduced by 16%, while the Total Vehicle Time and the Average Total Vehicle Speed 
were improved by 5.6% and 5.8%, respectively. As a comparison, simulation results of ALINEA 
were also presented. The switching mainline traffic responsive control was able to 
outperform ALINEA, when both algorithms were combined with the same queue 
length estimator and regulator. 